Storage path optimization for SANs

ABSTRACT

Embodiments of a system and method for rule-based proactive storage path optimization for SANs are described. Embodiments may evaluate paths between an application and its storage on a SAN based on current and/or historical path quality of service. Performance of alternative paths may be monitored to determine if a better path than a path currently in use is available. If a better path is determined, then the path may be switched to the better path. In one embodiment, one or more zones may be reconfigured to migrate to a different path. Path migration may be performed automatically without user intervention. Alternatively, a user may be given the option to manually migrate to a new path. Embodiments may proactively change paths between an application and its storage before path performance becomes a problem. Embodiments may be integrated with a SAN management system or, alternatively, may be standalone mechanisms.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention is related to the field of storage management and, more particularly, to software used in storage management.

2. Description of the Related Art

In the past, large organizations relied heavily on parallel SCSI technology to provide the performance required for their enterprise data storage needs. More recently, organizations are recognizing that the restrictions imposed by SCSI architecture are too costly for SCSI to continue as a viable solution. Such restrictions include the following:

-   SCSI disk arrays must be located no more than 25 meters from the host server;
-   The parallel SCSI bus is susceptible to data errors resulting from slight timing discrepancies or improper port termination; and
-   SCSI array servicing frequently requires downtime for every disk in the array.

One solution has been to create technology that enables storage arrays to reside directly on the network, where disk accesses may be made directly rather than through the server's SCSI connection. This network-attached storage (NAS) model eliminates SCSI's restrictive cable distance, signal timing, and termination requirements. However, it adds a significant load to the network, which frequently is already starved for bandwidth. Gigabit Ethernet technology only alleviates this bottleneck for the short term, so a more elegant solution is desirable.

The storage area network (SAN) model places storage on its own dedicated network, removing data storage from both the server-to-disk SCSI bus and the main user network. This dedicated network most commonly uses Fibre Channel technology, a versatile, high-speed transport. The SAN includes one or more hosts that provide a point of interface with LAN users, as well as (in the case of large SANs) one or more fabric switches, SAN hubs and other devices to accommodate a large number of storage devices. The hardware (e.g. fabric switches, hubs, bridges, routers, cables, etc.) that connects workstations and servers to storage devices in a SAN is referred to as a “fabric.” The SAN fabric may enable server-to-storage device connectivity through Fibre Channel switching technology to a wide range of servers and storage devices. The versatility of the SAN model enables organizations to perform tasks that were previously difficult to implement, such as LAN-free and server-free tape backup, storage leasing, and full-motion video services.

In a SAN environment, a path may be defined as a route through a SAN interconnect through which a SAN application communicates with its SAN storage. Determination and selection of optimum paths from storage to SAN applications using the storage may be difficult to achieve, especially in large SANs. SAN configuration may dynamically change, possibly creating bottlenecks, as a SAN grows. Prior art SAN systems may provide mechanisms for static path selection for SAN paths that may let a user select a fixed path manually based on search criteria such as the number of hops. These prior art mechanisms do not proactively monitor path metrics after the manual selection is made, and do not provide the ability to automatically determine and switch to better paths as the SAN changes. Thus, it is desirable to provide a mechanism to proactively identify SAN bottlenecks and to reconfigure SAN pathing “on the fly” to improve the flow of data through the SAN.

SUMMARY OF THE INVENTION

Embodiments of a system and method for rule-based proactive storage path optimization for SANs are described. Embodiments may evaluate paths between an application and its storage on a SAN based on current and/or historical path quality of service. Performance of two or more alternative paths may be monitored and the quality of service of the paths may be compared to determine if a better path than a path currently in use is available. If a better path is determined, then the path between the application and its storage may be switched to the better path. One embodiment may be implemented as a storage path monitor.

In one embodiment, paths may be defined by zones within the SAN fabric, and fabric zones may be reconfigured to migrate to a different path. Embodiments may use either or both of hard zoning and soft zoning to control paths within a fabric depending upon the user's desired configuration and/or upon which method of zoning the fabric switch vendor(s) support. In one embodiment, path migration may be performed automatically without user intervention. In another embodiment, a user may be notified of the better path so that the user may be given the option to choose whether to migrate to a new path.

In one embodiment, performance metrics may be monitored for two or more alternative paths, and a history of performance for the paths may be developed from the monitored metrics. Optimum path configurations may be selected based on the collected and/or generated performance metrics for the alternative paths. In one embodiment, one or more selection rules may be applied to the collected and/or generated performance metrics for the alternative paths to determine if a better path between an application and its storage than a path currently in use is available. As path statistics and/or performance metrics change, if it is determined that a different one of the alternative paths may provide better quality of service than the current path between an application and its storage, the application may be migrated to the different path between the application and its storage to preferably provide better quality of service for data transfers.

One embodiment may be configured to proactively change paths between an application and its storage before path performance becomes a problem. Performance data on path components such as switches may be monitored to determine the load on alternative paths. If the load on a path currently in use is determined to be high (e.g. above a high load threshold), the path between an application and its storage may be switched to a path for which the load is lower. One embodiment may collect and examine historical data on path utilization for two or more alternative paths to determine if there are periods when path performance historically is problematic. Historical data may be examined to determine if an alternative path may provide better quality of service during the problematic periods. If an alternative path that may provide better quality of service is identified, migration to the alternative path may be performed prior to the problematic period. After the period, the path may be changed back to the “regular” path.

In one embodiment, a storage path monitor may be integrated with a SAN management system. In another embodiment, a storage path monitor may be a standalone module that uses SAN component APIs to monitor the SAN components and perform zoning operations to provide alternative paths between an application and its storage. In yet another embodiment, a storage path monitor may be a standalone module that uses SAN component APIs to monitor the SAN components and interacts with APIs for one or more zoning mechanisms of a SAN management system to reconfigure one or more zones to provide alternative paths between an application and its storage.

BRIEF DESCRIPTION OF THE DRAWINGS

The following detailed description makes reference to the accompanying drawings, which are now briefly described.

FIG. 1 illustrates an exemplary SAN implementing an embodiment of a storage path monitor according to one embodiment.

FIG. 2 illustrates an exemplary SAN with a host system including a storage path monitor according to one embodiment.

FIGS. 3A and 3B illustrate a storage path monitor switching zones defining paths to provide better quality of service between an application and its storage according to one embodiment.

FIGS. 4A and 4B illustrate a storage path monitor reconfiguring a zone defining a path to provide better quality of service between an application and its storage according to one embodiment.

FIGS. 5A and 5B illustrate a storage path monitor adding a switch port to a zone supporting load-balanced switch ports to provide better quality of service between an application and its storage according to one embodiment.

FIG. 6 is a flowchart illustrating a method for storage path optimization in SANs according to one embodiment.

FIG. 7 shows an exemplary SAN implementing an embodiment of the SAN management system.

FIG. 8 illustrates the architecture of the SAN management system according to one embodiment.

FIG. 9 illustrates the architecture of the SAN access layer according to one embodiment.

FIG. 10 illustrates an exemplary SAN including a SAN management system and further illustrates the architecture and operation of the SAN management system according to one embodiment.

FIG. 11 illustrates LUN binding according to one embodiment.

FIG. 12 illustrates LUN masking according to one embodiment.

FIG. 13 illustrates fabric zoning according to one embodiment.

FIG. 14 illustrates a SAN including a LUN security utility according to one embodiment.

While the invention is described herein by way of example for several embodiments and illustrative drawings, those skilled in the art will recognize that the invention is not limited to the embodiments or drawings described. It should be understood that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims. The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include”, “including”, and “includes” mean including, but not limited to.

DETAILED DESCRIPTION OF EMBODIMENTS

Embodiments of a system and method for rule-based proactive storage path optimization for SANs are described. In a SAN environment, a path may be defined as a route through a SAN interconnect through which a SAN application communicates with its SAN storage. Embodiments of a storage path monitor may evaluate paths on a SAN based on current and/or historical path quality of service. Performance of two or more alternative paths between an application and its storage may be monitored and the quality of service of the paths may be compared to determine if a better path than a path currently in use is available. If a better path is determined, then the path between the application and its storage may be switched to the better path.

In one embodiment, two or more alternative paths from a SAN application to target storage may be determined. Embodiments may provide means for determining quality of service for each of the determined paths. In one embodiment, to determine quality of service, performance metrics may be monitored for the paths, and a history of performance for the paths may be developed from the monitored metrics. Monitored path performance metrics may include directly measured performance metrics (e.g. error metrics and status metrics) and statistical or other performance metrics computed from directly measured attributes collected over a period. In general, any path performance metric that may be related to the quality of service of a path may be monitored. Collected path performance metrics may be used to generate statistical or other performance metrics for the path. Optimum path configurations may be selected based on the collected and/or generated performance metrics for the alternative paths.

Statistical performance metrics that may be monitored may include one or more of, but are not limited to, port utilization, total number of frames transmitted, total number of frames received, class 1 frames dropped, class 2 frames dropped, class 3 frames dropped, class 1 frames rejected, class 2 frames rejected, link resets transmitted, class 1, 2 & 3 frames received, buffer credit not received, buffer credit not provided, etc. Error performance metrics that may be monitored may include one or more of, but are not limited to, CRC errors, address errors, encoding disparity errors, delimiter errors, frames too long, frames truncated, invalid transmission words, primitive sequence protocol errors, etc. Status performance metrics that may be monitored may include one or more of, but are not limited to, switch port status, device port status, device status, connectivity status, IP status, link failures, synchronization loss detectors, power supply failure, etc.
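
By way of illustration only, the following Python sketch shows one possible way a monitor might organize the metric categories named above; the `MetricCategory` type, the snake_case metric names, and the registry are hypothetical and not part of any embodiment.

```python
from enum import Enum

class MetricCategory(Enum):
    STATISTICAL = "statistical"
    ERROR = "error"
    STATUS = "status"

# Hypothetical registry mapping a few of the metrics named above to a
# category; a real monitor would register many more, per component type.
METRIC_CATEGORIES = {
    "port_utilization": MetricCategory.STATISTICAL,
    "frames_transmitted": MetricCategory.STATISTICAL,
    "buffer_credit_not_received": MetricCategory.STATISTICAL,
    "crc_errors": MetricCategory.ERROR,
    "frames_truncated": MetricCategory.ERROR,
    "primitive_sequence_protocol_errors": MetricCategory.ERROR,
    "switch_port_status": MetricCategory.STATUS,
    "link_failures": MetricCategory.STATUS,
}

def metrics_in(category: MetricCategory) -> list[str]:
    """Return the registered metric names belonging to one category."""
    return [name for name, cat in METRIC_CATEGORIES.items() if cat is category]

print(metrics_in(MetricCategory.ERROR))
```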

Collection of performance metrics may be performed using in-band and/or out-of-band mechanisms. In one embodiment, these collection mechanisms may include in-band mechanisms that may employ logical or physical connectivity to collect information from the various hardware components of the SAN. In one embodiment, these collection mechanisms may include out-of-band mechanisms that are independent of the connectivity of the in-band path, including one or more of, but not limited to, SNMP, telnet sessions to hardware telnet interfaces, and connections to web-based hardware interfaces.
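
As a loose, non-normative sketch of how in-band and out-of-band collection mechanisms might share a common interface, consider the following; the collector classes are hypothetical, and the `collect` bodies are stubs rather than real SNMP or Fibre Channel calls.

```python
from abc import ABC, abstractmethod

class MetricCollector(ABC):
    """Common interface for in-band and out-of-band collection mechanisms."""

    @abstractmethod
    def collect(self, component_id: str) -> dict[str, float]:
        """Return current performance metrics for one SAN component."""

class InBandCollector(MetricCollector):
    """Would query a component over the Fibre Channel path itself;
    stubbed here with a fixed reading."""

    def collect(self, component_id: str) -> dict[str, float]:
        return {"port_utilization": 0.42}

class SnmpCollector(MetricCollector):
    """Out-of-band: would poll the component's SNMP agent over Ethernet;
    stubbed here rather than invoking a real SNMP library."""

    def collect(self, component_id: str) -> dict[str, float]:
        return {"port_utilization": 0.42, "crc_errors": 0.0}
```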

Embodiments may provide means for determining an alternative path predicted to provide a higher quality of service than a currently used path. In one embodiment, one or more selection rules may be applied to the collected and/or generated performance metrics for the alternative paths to determine if a better path between an application and its storage than a path currently in use is available. As path statistics and/or performance metrics change, if it is determined that a different one of the alternative paths may provide better quality of service than the current path between an application and its storage, the application may be migrated to the different path between the application and its storage to preferably provide higher quality of service for data transfers.

In one embodiment, a group of two or more redundant paths that provide redundancy for data transmission between an application and its storage may be monitored to determine quality of service provided by the paths. Other alternative paths may also be monitored. If quality of service of one of the redundant paths falls below a quality of service threshold, a monitored alternative path which may provide higher quality of service than the redundant path with low quality of service (and not currently in the group of redundant paths) may be identified to replace the problem path in the group of redundant paths to preferably maintain the quality of service requirements for the group of redundant paths.

Embodiments may provide means for changing the paths so that an application accesses its storage via an alternative path to preferably provide higher quality of service for data transfers. In one embodiment, paths may be defined by zones within the SAN fabric, and fabric zones may be reconfigured to migrate to a different path. While embodiments are generally described herein as using zoning to modify and/or create paths, it is noted that other embodiments may use mechanisms other than zoning for modifying existing paths and/or creating new paths to migrate to different paths. Some embodiments may use combinations of two or more mechanisms to create and/or modify paths. In general, embodiments may use any mechanism available in a SAN to modify and/or create paths. In one embodiment, path migration may be performed automatically without user intervention. In another embodiment, a user may be notified of the better path so that the user may be given the option to choose whether to migrate to a new path. In one embodiment, a storage path monitor may be configured to allow either or both user notification of and switching to determined better paths and automatic switching to determined better paths.

One embodiment of a system and method for rule-based proactive storage path optimization for SANs may be implemented as a storage path monitor on one or more systems coupled to a SAN. FIG. 1 illustrates an exemplary SAN implementing an embodiment of a storage path monitor according to one embodiment. For one embodiment, a SAN may be described as a high-speed, special-purpose network that interconnects storage devices 304 (e.g. storage devices 304A, 304B, and 304C) with associated data servers (e.g. hosts 302A, 302B, and 302C) on behalf of a larger network of users. A SAN may employ Fibre Channel technology. A SAN may include one or more hosts 302 (e.g. hosts 302A, 302B, and 302C), one or more storage devices 304 (e.g. storage devices 304A, 304B, and 304C), and one or more SAN fabrics 318. One or more end-user platforms (not shown) may access the SAN, typically via a LAN or WAN connection to one or more of the hosts 302.

Storage devices 304 may include one or more of, but are not limited to, RAID (Redundant Array of Independent Disks) systems, disk arrays, JBODs (Just a Bunch Of Disks, used to refer to disks that are not configured according to RAID), tape devices, and optical storage devices. Hosts 302 may run any of a variety of operating systems, including, but not limited to: Solaris 2.6, 7, 8, 9, etc.; Linux; AIX; HP-UX 11.0b, 11i, etc.; Microsoft Windows NT 4.0 (Server and Enterprise Server) and Microsoft Windows 2000 (Server, Advanced Server and Datacenter Editions). Each host 302 is typically connected to the fabric 318 via one or more Host Bus Adapters (HBAs). SAN fabric 318 may enable server-to-storage device connectivity through Fibre Channel switching technology. SAN fabric 318 hardware may include one or more fabric components (e.g. switches 308, bridges 310, hubs 312, or other devices 314 such as routers), as well as the interconnecting cables (for Fibre Channel SANs, fibre optic cables).

Host systems 302 may include one or more SAN applications 320 such as SAN application 320A on host 302A and SAN application 320B on host 302C. One or more host systems 302 may each include an instance of the storage path monitor 300; in this example, host 302B includes an instance of storage path monitor 300.

FIG. 2 illustrates an exemplary SAN with a host system including a storage path monitor according to one embodiment. Host systems 302 may be any of various types of devices, including, but not limited to, a personal computer system, desktop computer, laptop or notebook computer, mainframe computer system, workstation, network appliance, network computer, Internet appliance, or other suitable device. Host system 302B may include at least one processor 322. The processor 322 may be coupled to a memory 324. Memory 324 is representative of various types of possible memory media, also referred to as “computer readable media.” Hard disk storage, floppy disk storage, removable disk storage, flash memory and random access memory (RAM) are examples of memory media. The terms “memory” and “memory medium” may include an installation medium, e.g., a CD-ROM or floppy disk, a computer system memory such as DRAM, SRAM, EDO RAM, SDRAM, DDR SDRAM, Rambus RAM, etc., or a non-volatile memory such as magnetic media, e.g., a hard drive or optical storage. The memory medium may include other types of memory as well, or combinations thereof. Host system 302B may include, in memory 324, a storage path monitor 300.

Host system 302B may couple to one or more SAN components such as other hosts 302, storage devices 304, backup devices 330, fabric components including switches 308, and other SAN components via network interface 332. Network interface 332 may include one or more network connections to one or more different types of communications networks. Storage path monitor 300 may monitor components of one or more paths 340, such as path 340A and path 340B, via one or more in-band and/or out-of-band network connections. Host system 302B may couple to the SAN components via one or more out-of-band network connections (e.g. Ethernet, LAN, WAN or other network connections). Host system 302B may also couple to the SAN components via one or more in-band network connections. In-band refers to transmission of a protocol other than the primary data protocol over the same medium (e.g. Fibre Channel) as the primary data protocol of the SAN. Out-of-band refers to transmission of information among SAN components outside of the Fibre Channel network, typically over Ethernet, on a LAN, WAN, or other network. Host system 302B may also couple to one or more storage devices 304 via Fibre Channel through the SAN fabric for SAN data transmission using the primary data protocol.

In one embodiment, more than one host system 302 may include instances of storage path monitor 300. While this example illustrates the storage path monitor 300 on host system 302B of the SAN, in some embodiments, the storage path monitor may reside on a non-host (e.g. end-user) system coupled to the SAN via a LAN or WAN connection to one or more of the host systems 302.

An instance of storage path monitor 300 may determine and monitor two or more alternative paths 340 (e.g. paths 340A and 340B) on the SAN between an application 320 and its storage to collect path performance metrics from components of the path(s) 340. Path components that may be monitored may include one or more of, but are not limited to, Host Bus Adapters (HBAs), HBA ports, switches 308, switch ports, hubs 312, bridges 310, LUNs, storage device ports, and in general any component that may be part of a path between an application and its storage. In one embodiment, storage path monitor 300 may monitor SAN components by communicating with the SAN components via one or more in-band and/or out-of-band communication channels. In one embodiment, storage path monitor 300 may generate and store information describing the path(s) 340 and indicating the member components in the path(s) 340. In one embodiment, this information may be stored in one or more database tables.
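
The following hypothetical Python sketch illustrates one way such path-membership information might be modeled before being flattened into database tables; the `Path` and `PathComponent` types, and the use of the figure's reference numerals as identifiers, are illustrative only.

```python
from dataclasses import dataclass, field

@dataclass
class PathComponent:
    """One member of a path: an HBA port, switch port, storage port, etc."""
    component_id: str
    kind: str

@dataclass
class Path:
    """A route through the SAN interconnect from application to storage,
    suitable for flattening into path and path-membership tables."""
    path_id: str
    application: str
    storage: str
    components: list[PathComponent] = field(default_factory=list)

path_340a = Path("340A", "application-320", "storage-304", [
    PathComponent("392A", "hba_port"),
    PathComponent("390A", "switch_port"),
    PathComponent("386A", "storage_port"),
])
```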

For each SAN component on a path, storage path monitor 300 may monitor one or more component performance metrics. Monitored component performance metrics may include one or more of, but are not limited to, throughput, bytes transferred, error rates, frames dropped, etc. Monitored component performance metrics may include directly measured performance metrics (e.g. throughput, bytes transferred, frames dropped, etc.) and statistical or other performance metrics computed from directly measured attributes collected by the component over a period (e.g. error rates, frame rates, etc.). In general, any component performance metric that may be related to the quality of service of a path may be monitored. In one embodiment, storage path monitor 300 may generate statistical or other performance metrics from the collected component performance metrics. For example, a particular performance metric may be collected from a particular component of a path over a period and used to generate a mean or median for the performance metric over the period. As another example, a performance metric may be collected from two or more path components and used to generate a combined performance metric for the two or more components. As yet another example, a ratio of two separately collected performance metrics may be generated.
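
A minimal sketch of the three kinds of generated metrics just described (a per-period mean/median, a combined metric across components, and a ratio of two collected metrics), using Python's standard statistics module; all sample values are invented.

```python
from statistics import mean, median

# Samples of one metric collected from one component over a period.
throughput_samples = [0.61, 0.58, 0.72, 0.65, 0.70]
period_mean, period_median = mean(throughput_samples), median(throughput_samples)

# A combined metric across two or more path components: total frames
# dropped along the path.
frames_dropped = {"390A": 12, "310": 3, "386A": 0}
path_frames_dropped = sum(frames_dropped.values())

# A ratio of two separately collected metrics: CRC errors per frame received.
crc_errors, frames_received = 7, 1_250_000
error_ratio = crc_errors / frames_received
```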

Storage path monitor 300 may compare the collected and/or generated performance metrics for the alternative paths to determine if there is a better path available, based on quality of service, than the path currently in use between an application and its storage. As path statistics and/or performance metrics change, if the storage path monitor 300 determines an alternative path that may provide better quality of service than the current path between an application and its storage, the application may be migrated to the different path between the application and its storage to preferably provide better quality of service for data transfers. In one embodiment, path migration may be performed automatically without user intervention. In another embodiment, a user may be notified of the better path so that the user may be given the option to choose whether to migrate to a new path. In one embodiment, storage path monitor 300 may be configured to allow either or both user notification of and switching to determined better paths and automatic switching to determined better paths.

In one embodiment, one or more selection rules may be applied to the collected and/or generated performance metrics for the alternative paths to determine if a better path between the application and its storage than a path currently in use is available. In one embodiment, the selection rules may compare the performance metrics to one or more thresholds for the performance metrics to determine relative quality of service for the alternative paths. In one embodiment, there may be a quality of service low threshold that may be used by the selection rules to identify paths currently in use that have fallen below the quality of service low threshold. In one embodiment, there may be a quality of service high threshold that may be used by the selection rules to identify alternative paths to the current path. If, for the current path, one or more of the performance metrics are exceeding thresholds for the performance metrics that indicate quality of service for the current path may be adversely affected or is being adversely affected, storage path monitor 300 may look for an alternative path that may offer higher quality of service than the current path. In one embodiment, storage path monitor 300 may attempt to identify an alternative path for which the predicted quality of service is above a high quality of service threshold. This may preferably prevent switching to an alternative path that may only provide marginal improvement in quality of service over the current path. In one embodiment, the path may be defined by a fabric zone, and the zone may be reconfigured to use a switch port with a lower traffic rate.
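
One possible reading of the low/high-threshold selection rule is sketched below; the function name, the threshold defaults, and the 0-to-1 quality-of-service scale are assumptions, not specified by the embodiments.

```python
def better_path_available(current_qos: float,
                          alternatives: dict[str, float],
                          low_threshold: float = 0.3,
                          high_threshold: float = 0.7) -> str | None:
    """Return the id of an alternative path worth switching to, or None.

    Switching is considered only once the current path has fallen below
    the quality of service low threshold, and only an alternative whose
    predicted quality of service exceeds the high threshold is selected,
    preventing switches that yield merely marginal improvement.
    """
    if current_qos >= low_threshold:
        return None
    candidates = {p: q for p, q in alternatives.items() if q > high_threshold}
    return max(candidates, key=candidates.get) if candidates else None

print(better_path_available(0.2, {"340B": 0.9, "340C": 0.5}))  # -> "340B"
```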

In one embodiment, storage path monitor 300 may perform path modification if quality of service for the path stays below a quality of service low threshold for the path for a given time period. In one embodiment, the time period may be set to range from 0 (to cause immediate path modification when the quality of service falls below the low threshold) to a duration of seconds, minutes, hours, days, etc. In one embodiment, the quality of service for a path may be averaged over a given time period, and the average compared to the quality of service low threshold to determine if path modification may be performed. In one embodiment, the computed average for the quality of service may be required to remain below the quality of service low threshold for a given time period for path modification to be performed. Similarly, in one embodiment, storage path monitor 300 may select alternative paths that have maintained a high quality of service for a given time period to replace existing paths with low quality of service. In one embodiment, if two or more alternative paths have maintained a high quality of service (e.g. above a high quality of service threshold), an alternative path that has maintained high quality of service for the longest time period may be selected. Other embodiments may use other methods to select from among alternative paths.
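
The sustained-low-quality test described above might be realized as follows; this sketch assumes a rolling sample window, a 0-to-1 quality-of-service scale, and the 0.9-of-the-window coverage requirement, all of which are hypothetical choices.

```python
import time
from collections import deque

class QosWindow:
    """Rolling record of (timestamp, qos) samples for one path."""

    def __init__(self, window_seconds: float):
        self.window = window_seconds
        self.samples: deque[tuple[float, float]] = deque()

    def add(self, qos: float, now: float | None = None) -> None:
        now = time.time() if now is None else now
        self.samples.append((now, qos))
        # Discard samples older than the configured time period.
        while now - self.samples[0][0] > self.window:
            self.samples.popleft()

    def average(self) -> float:
        return sum(q for _, q in self.samples) / len(self.samples)

def modification_due(w: QosWindow, low_threshold: float) -> bool:
    """True when the averaged quality of service has stayed below the low
    threshold for (essentially) the whole configured period."""
    if not w.samples:
        return False
    observed = w.samples[-1][0] - w.samples[0][0]
    return observed >= 0.9 * w.window and w.average() < low_threshold
```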

As an example of applying selection rules to the collected and/or generated performance metrics for the alternative paths to determine if a better path between the application and its storage than a path currently in use is available, one embodiment may look at utilization of the path and/or the components of the path. If the utilization reaches or passes a certain percentage of the total throughput possible for the path or a component of the path (i.e. reaches or passes a high utilization threshold), and an alternative path with lower utilization is found, storage path monitor 300 may switch usage to the alternative path. In one embodiment, storage path monitor 300 may only switch usage to an alternative path if an alternative path with utilization below a low utilization threshold is found. In one embodiment, storage path monitor 300 may attempt to identify and switch to an alternative path only if the utilization stays at or above a certain percentage of the total throughput possible for the path or a component of the path for a given time period (e.g. a given number of seconds, minutes, days, etc.). In one embodiment, storage path monitor 300 may maintain an average utilization for the current path over a given time period, and may attempt to identify and switch to an alternative path only if the average utilization stays at or above a certain percentage of the total throughput possible for the path or a component of the path for a given time period. For example, if the traffic rate on a particular switch port is greater than 90% of the maximum throughput allowed (e.g. above a high traffic rate threshold), then storage path monitor 300 may attempt to determine and switch to an alternative path that uses a switch port that has a lower traffic rate (e.g., below a low traffic rate threshold of, for example, 10%). As another example, if the traffic rate on a particular switch port stays above a high traffic rate threshold for a given time period (e.g. 30 seconds), then storage path monitor 300 may attempt to determine and switch to an alternative path that uses a switch port that has a lower traffic rate (e.g., below a low traffic rate threshold of, for example, 10%).
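
The switch-port utilization rule of this example (90% high threshold, 10% low threshold, 30-second persistence) could be expressed as below; the one-second sampling cadence is an assumption.

```python
def should_migrate(port_utilization: list[float],
                   alt_port_utilization: float,
                   high: float = 0.90,
                   low: float = 0.10,
                   required_samples: int = 30) -> bool:
    """Migrate only if the current switch port has stayed at or above the
    high traffic rate threshold for the whole observation window (here,
    30 one-second samples) and an alternative port sits below the low
    traffic rate threshold."""
    recent = port_utilization[-required_samples:]
    sustained_high = (len(recent) == required_samples
                      and all(u >= high for u in recent))
    return sustained_high and alt_port_utilization < low
```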

Storage path monitor 300 may monitor path performance and proactively tune the path performance based on quality of service statistics. In one embodiment, storage path monitor 300 may be configured to proactively change paths between an application and its storage before path performance becomes a problem. Performance data on path components such as switches may be monitored by storage path monitor 300 to determine the load on alternative paths. If the load on a path currently in use is determined to be high (e.g. above a high load threshold), storage path monitor 300 may switch to a path for which the load is lower (e.g. below a low load threshold). In one embodiment, storage path monitor 300 may collect and examine historical data on path utilization for two or more alternative paths to determine if there are periods (e.g. days of the week, hours of the day, etc.) when path performance historically is problematic. Storage path monitor 300 may examine the historical data to determine if an alternative path may provide better quality of service during the problematic periods. If an alternative path that may provide better quality of service is identified, storage path monitor 300 may schedule migration to the alternative path prior to the problematic period. After the period, storage path monitor 300 may change the path back to the “regular” path. For example, if one path between an application and its storage is determined to have a high load rate for a particular period (e.g. on a particular day), and another path is determined to have a low load rate for the same period, storage path monitor 300 may move data transmission between the application and its storage to the low traffic path for the period.
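
A hypothetical sketch of the historical analysis: given timestamped load samples for a path, find the weekdays whose average load has historically exceeded a high-load threshold, so that migration can be scheduled before those periods; the 0.8 threshold is an invented example value.

```python
from collections import defaultdict
from datetime import datetime

def problematic_weekdays(load_history: list[tuple[datetime, float]],
                         high_load: float = 0.8) -> set[int]:
    """Return weekdays (0 = Monday) on which the path's average load has
    historically exceeded the high-load threshold; migration to a
    lower-loaded path would be scheduled just before each such day and
    reverted to the "regular" path afterwards."""
    by_day: defaultdict[int, list[float]] = defaultdict(list)
    for timestamp, load in load_history:
        by_day[timestamp.weekday()].append(load)
    return {day for day, loads in by_day.items()
            if sum(loads) / len(loads) > high_load}
```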

The following describes an exemplary scenario for proactively using embodiments of a storage path monitor to improve quality of service in a SAN and is not intended to be limiting. Storage path monitor 300 may monitor traffic on two or more paths in a SAN between an application and its storage. If it is determined that a path used by a critical application has high throughput on a certain day of the week, storage path monitor 300 may schedule a path migration to a lower-utilized path before that day. Storage path monitor 300 may then perform the migration when scheduled, and may revert back to the original “regular” path after the day is over.

In one embodiment, paths may be defined by zones in the SAN fabric. FIG. 3A illustrates an exemplary SAN with zones defining paths according to one embodiment. A zone 394 is a set of objects within a SAN fabric that can access one another. By creating and managing zones 394, host access to storage resources may be controlled. Zoning-enabled fabrics may include zoning tables that define each zone along with its member objects. In one embodiment, zones 394 and their member objects may be defined in zoning tables within the switches (e.g. switch 308) on the SAN fabric. When zoning is implemented on a SAN fabric, the switches (e.g. switch 308) consult the zoning table to determine whether one object is permitted to communicate with another object, and restrict access between them unless they share a common membership in at least one zone 394. Fabric zoning occurs at the level of individual nodes or ports attached to the SAN fabric.
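
The membership check a zoning-enabled switch performs might be modeled as follows; the zoning-table representation and the port-style identifiers are hypothetical.

```python
# A minimal model of a zoning table: each zone maps to the set of member
# objects (identified here by invented port names) that may access one another.
zoning_table: dict[str, set[str]] = {
    "394A": {"hba:392A", "switch:390A", "storage:386A"},
    "394B": {"hba:392B", "switch:390B", "storage:386B"},
}

def may_communicate(obj_a: str, obj_b: str) -> bool:
    """The check a zoning-enabled switch performs: two objects may
    communicate only if they share membership in at least one zone."""
    return any(obj_a in members and obj_b in members
               for members in zoning_table.values())

assert may_communicate("hba:392A", "storage:386A")
assert not may_communicate("hba:392A", "storage:386B")
```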

Zoning may be performed using one or both of soft zoning and hard zoning in a fabric. Soft zoning, also called advisory zoning, may be enforced simply by filtering the visibility of objects on the SAN so that an object can only see other objects that share at least one zone membership with the object. In hard zoning, a Fibre Channel switch 308 may actively block access to zone members from any objects outside the zone. This may be performed at the level of ports 390 on the switch 308. Hard zoning may also be referred to as switch port zoning. Embodiments of storage path monitor 300 may use either or both of hard zoning and soft zoning to control paths within a fabric depending upon the user's desired configuration and/or upon which method of zoning the fabric switch vendor(s) support.

A fabric may include more than one zone 394, and two or more zones may provide alternative paths between an application and its storage. In FIG. 3A, two zones 394A and 394B are shown as alternative paths between application 320 on host 302 and the application's storage on storage device 304. In this example, zone 394A includes HBA port 392A on host 302, switch port 390A on switch 308, and port 386A on storage device 304. Zone 394B includes HBA port 392B on host 302, switch port 390B on switch 308, and port 386B on storage device 304. Note that zones may include more than one of HBA ports 392, switch ports 390, and storage device ports 386. In one embodiment, storage path monitor 300 may determine that an alternative path defined by another zone 394 may provide better quality of service than a path defined by the zone 394 currently in use, and the application 320 may be migrated to use the alternative path in the other zone 394. For example, zone 394A may be currently in use as the path for data transmission between application 320 and its storage on storage device 304. Storage path monitor 300 may monitor performance metrics of components in both zones 394A and 394B. The monitored performance metrics and one or more generated performance metrics may be applied to one or more selection rules to determine the quality of service of the two paths defined by zones 394A and 394B. If it is determined that the path defined by zone 394B may offer better quality of service than the path defined by zone 394A, then storage path monitor 300 may switch application 320 to use the alternative path defined by zone 394B for data transmission between application 320 and its storage on storage device 304, as illustrated in FIG. 3B.
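
A minimal sketch of the zone switch of FIGS. 3A and 3B, covering both the automatic embodiment and the notify-the-user embodiment; the `migrate_to_zone` helper and its parameters are invented for illustration.

```python
def migrate_to_zone(active_zones: set[str], old_zone: str, new_zone: str,
                    automatic: bool = True, notify=print) -> None:
    """Move the application's traffic onto the path defined by the better
    zone, either automatically or, per the alternative embodiment, by
    notifying the user and awaiting a manual decision."""
    if not automatic:
        notify(f"Better path available via zone {new_zone}; "
               f"approve migration from zone {old_zone}?")
        return
    active_zones.discard(old_zone)
    active_zones.add(new_zone)

active = {"394A"}
migrate_to_zone(active, "394A", "394B")   # the switch of FIGS. 3A/3B
assert active == {"394B"}
```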

In one embodiment, fabric zones may be reconfigured to migrate to an alternative path that may provide better quality of service. In this embodiment, fabric components may be rezoned to force migration to an alternative path between an application and its storage that may provide better quality of service than a path currently in use. One or more components (e.g. switch ports) may be removed from and/or added to one or more existing zones, or alternatively a new zone may be created, to provide the alternative path between the application and its storage.

FIG. 4A illustrates an exemplary SAN with a zone defining a path according to one embodiment. In this example, zone 394 includes HBA port 392 on host 302, switch port 390A on switch 308, and port 386 on storage device 304. In one embodiment, storage path monitor 300 may determine that reconfiguring zone 394 may provide better quality of service than is currently provided by the path defined by the zone 394. For example, zone 394 may be currently in use as the path for data transmission between application 320 and its storage on storage device 304. Storage path monitor 300 may monitor performance metrics of components in zone 394 (e.g. HBA port 392, switch port 390A, and port 386) as well as other components such as switch port 390B. The monitored performance metrics and one or more generated performance metrics may be applied to one or more selection rules to determine the quality of service of the path defined by zone 394 as well as one or more alternative paths which may be defined by modifying zone 394, such as a path including HBA port 392, switch port 390B, and port 386. If it is determined that an alternative path including HBA port 392, switch port 390B, and port 386 may offer better quality of service than the path currently defined by zone 394, then storage path monitor 300 may replace switch port 390A with switch port 390B in zone 394 to use the alternative path defined by the modified zone 394 for data transmission between application 320 and its storage on storage device 304, as illustrated in FIG. 4B.
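
The in-place zone reconfiguration of FIGS. 4A and 4B amounts to swapping one zone member for another, as in this hypothetical sketch (identifiers again borrowed from the figures for illustration):

```python
zoning_table = {"394": {"hba:392", "switch:390A", "storage:386"}}

def replace_zone_member(zone_id: str, old_member: str, new_member: str) -> None:
    """Reconfigure a zone in place, swapping one member for another, so
    the existing zone now defines the alternative path."""
    members = zoning_table[zone_id]
    members.discard(old_member)
    members.add(new_member)

replace_zone_member("394", "switch:390A", "switch:390B")
assert zoning_table["394"] == {"hba:392", "switch:390B", "storage:386"}
```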

Some fabric components (e.g. switches) may perform load balancing between two or more ports in a zone. Data transmitted on a path defined by this zone between an application and its storage may pass through these ports as determined by the load balancing mechanism of the fabric component. FIG. 5A illustrates an exemplary SAN with a zone defining a path and including load-balanced switch ports according to one embodiment. Zone 394 may be currently in use as the path for data transmission between application 320 and its storage on storage device 304. Zone 394 may include HBA port 392 on host 302, load-balanced switch ports 390A and 390B on switch 308, and port 386 on storage device 304. Storage path monitor 300 may monitor performance metrics of components in zone 394 (e.g. HBA port 392, load-balanced switch ports 390A and 390B, and port 386), as well as other components such as switch port 390C. In one embodiment, storage path monitor 300 may monitor the load-balanced switch ports 390A and 390B and, for example, may determine that utilization of the load-balanced ports is exceeding a throughput threshold, for example by applying monitored port performance metrics to one or more selection rules. In one embodiment, storage path monitor 300 may reconfigure the zone 394 by adding one or more additional switch ports to the load-balanced ports. In this example, storage path monitor 300 may add switch port 390C to zone 394, as illustrated in FIG. 5B. After reconfiguration, some or all of the data transmitted on the path defined by zone 394 between application 320 and its storage may pass through the added switch port(s) as determined by the load balancing mechanism of the switch 308 to preferably provide better quality of service on the path defined by zone 394.
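
The load-balanced case of FIGS. 5A and 5B might be sketched as follows; the 0.85 utilization threshold is an invented example value.

```python
def maybe_add_balanced_port(zone_members: set[str],
                            balanced_utilization: float,
                            spare_port: str,
                            threshold: float = 0.85) -> bool:
    """If the load-balanced ports' combined utilization exceeds the
    throughput threshold, add a spare switch port to the zone so the
    switch's load-balancing mechanism can spread traffic across it."""
    if balanced_utilization > threshold:
        zone_members.add(spare_port)
        return True
    return False

zone_394 = {"hba:392", "switch:390A", "switch:390B", "storage:386"}
maybe_add_balanced_port(zone_394, 0.92, "switch:390C")   # FIG. 5B
```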

A zone may include alternative paths between an application and its storage. For example, two or more ports of a fabric component (e.g. a switch) may be included in a zone. Each of the ports may represent a different path between an application and its storage. Some fabric components (e.g. switches) may provide an API through which the fabric component may be directed to use a particular port for a path. In one embodiment, storage path monitor 300 may determine that an alternative path in a zone through a first port on a fabric component may provide better quality of service than a currently used path through a second port on the fabric component. If this fabric component provides an API through which the fabric component may be directed to use a particular port for a path, storage path monitor 300 may direct the fabric component to use the first port to provide the alternative path between the application and its storage.

FIG. 6 is a flowchart illustrating a method for storage path optimization in SANs according to one embodiment. As indicated at 400, a plurality of paths between an application on a host system and its application data on a storage device in a SAN may be determined. The application may currently access the application data via one of the plurality of paths. As indicated at 402, performance metrics (e.g. throughput, bytes transferred, error rates, frames dropped, etc.) of the plurality of paths may be monitored. In one embodiment, to monitor performance metrics of the plurality of paths, one or more performance metrics of one or more components of each path may be monitored. As indicated at 404, quality of service of each of the paths may be determined from the monitored performance metrics. As indicated at 406, an alternative one of the plurality of paths predicted to provide a higher quality of service than the current path may be determined. In one embodiment, to determine the alternative path, one or more selection rules may be applied to the performance metrics of each of the monitored paths. As indicated at 408, the paths may be changed so that the application accesses the application data via the alternative path to preferably provide a higher quality of service than the current path. In one embodiment, the paths may be changed to the alternative path if the current path is performing (or, alternatively, is predicted to perform) below a quality of service low threshold. In one embodiment, the paths may be changed to the alternative path if the current path is performing (or, alternatively, is predicted to perform) below a quality of service low threshold and the alternative path is predicted to perform above a quality of service high threshold. In one embodiment, the paths are defined by zones on the SAN fabric, and, to change the paths, one or more of the zones may be reconfigured so that the application accesses the application data via the alternative path.
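
Read as pseudocode, the method of FIG. 6 is a monitoring loop; in the hypothetical Python sketch below, every callable (`determine_paths`, `monitor`, `qos_of`, `select_alternative`, `reconfigure_zones`) is an assumed hook rather than a defined API.

```python
import time

def optimize_path(app, determine_paths, monitor, qos_of,
                  select_alternative, reconfigure_zones,
                  poll_seconds: float = 60.0) -> None:
    """Determine the candidate paths (400), then loop: monitor their
    performance metrics (402), derive quality of service (404), apply the
    selection rules to find a better alternative (406), and rezone so the
    application uses it (408)."""
    paths = determine_paths(app)                              # 400
    while True:
        metrics = {p: monitor(p) for p in paths}              # 402
        qos = {p: qos_of(m) for p, m in metrics.items()}      # 404
        alternative = select_alternative(app.current_path, qos)  # 406
        if alternative is not None:
            reconfigure_zones(app, alternative)               # 408
            app.current_path = alternative
        time.sleep(poll_seconds)
```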

One embodiment of a system and method for rule-based proactive storage path optimization for SANs may be implemented in a SAN management system such as the exemplary SAN management system described below. In one embodiment, a storage path monitor may be implemented in a SAN management server such as the SAN management server of the exemplary SAN management system described below. In one embodiment, a SAN management system such as the exemplary SAN management system described below may discover information for SAN components including, but not limited to, hosts, storage devices, and fabric devices (e.g. switches), and the discovered information may be accessed by the storage path monitor and used in determining paths to monitor, components of paths to be monitored, and performance metrics of the components to be monitored. In one embodiment, path information may be stored in and accessed from a database of a SAN management system, for example, a SAN access layer data store of the exemplary SAN management system described below.

In one embodiment, a SAN management system such as the exemplary SAN management system described below may collect usage and other performance-related metrics from the path components that the storage path monitor is monitoring, for example using collectors of a SAN access layer of the exemplary SAN management system described below, and provide the collected performance metrics to the storage path monitor. In one embodiment, a SAN management system such as the exemplary SAN management system described below may store the collected performance metrics in a database, and the storage path monitor may access the database to obtain the desired performance metrics. In one embodiment, a SAN management system such as the exemplary SAN management system described below may generate historical performance information in a database, which may be accessed by the storage path monitor to perform analysis of historical quality-of-service performance. In one embodiment, the storage path monitor may then use the collected path performance metrics to generate statistical or other performance metrics for the paths being monitored. Optimum path configurations may be selected based on the collected and/or generated performance metrics for the alternative paths. In one embodiment, one or more selection rules may be applied to the collected and/or generated performance metrics for the alternative paths to determine if a better path between an application and its storage than a path currently in use is available.

In one embodiment, a SAN management system such as the exemplary SAN management system described below may provide one or more mechanisms for managing and configuring zones, such as a zone utility and a LUN security utility, both described below for the exemplary SAN management system. In one embodiment, storage path monitor may interact with APIs for one or more of these zoning mechanisms to reconfigure one or more zones to provide an alternative path between an application and its storage that may provide better quality of service than a path currently in use.

In one embodiment, rather than automatically reconfiguring zones, storage path monitor may inform a user of the SAN management system of an alternative path that may provide better quality of service than a path currently in use and thus allow the user to decide whether to switch to the alternative path. In one embodiment, to inform the user, storage path monitor may send a message to a management console such as the SAN manager of the exemplary SAN management system described below. In one embodiment, the user may then instruct storage path monitor to reconfigure the one or more zone(s) to provide the alternative path between an application and its storage. Alternatively, the user may use one or more zoning mechanisms provided by the SAN management system to manually reconfigure the zone(s), if desired.

In one embodiment, storage path monitor may be integrated with a SAN management system such as the exemplary SAN management system described below. In another embodiment, storage path monitor may be a standalone module that uses SAN component APIs (such as fabric switch APIs) to monitor the SAN components and perform zoning operations to provide alternative paths between an application and its storage. In yet another embodiment, storage path monitor may be a standalone module that uses SAN component APIs (such as fabric switch APIs) to monitor the SAN components and interacts with APIs for one or more zoning mechanisms of a SAN management system such as the exemplary SAN management system described below to reconfigure one or more zones to provide alternative paths between an application and its storage.

While embodiments are generally described herein in regard to SANs and SAN applications, it is noted that embodiments may be implemented in other network environments to provide proactive path monitoring and optimization in those environments.

SAN Management System

Embodiments of a centralized Storage Area Network (SAN) management system are described. FIG. 7 shows an exemplary SAN implementing an embodiment of the SAN management system. For one embodiment, a SAN may be described as a high-speed, special-purpose network that interconnects storage devices 104 (e.g. storage devices 104A, 104B, and 104C) with associated data servers (e.g. hosts 102A, 102B, and 102C) on behalf of a larger network of users. A SAN may employ Fibre Channel technology. A SAN may include one or more hosts 102 (e.g. hosts 102A, 102B, and 102C), one or more storage devices 104 (e.g. storage devices 104A, 104B, and 104C), and one or more SAN fabrics 100. A SAN may also include one or more administration systems 106. One or more end-user platforms (not shown) may access the SAN, typically via a LAN or WAN connection to one or more of the hosts 102.

Storage devices 104 may include, but are not limited to, RAID (Redundant Array of Independent Disks) systems, disk arrays, JBODs (Just a Bunch Of Disks, used to refer to disks that are not configured according to RAID), tape devices, and optical storage devices. Hosts 102 may run any of a variety of operating systems, including, but not limited to: Solaris 2.6, 7, 8, 9, etc.; Linux; AIX; HP-UX 11.0b, 11i, etc.; Microsoft Windows NT 4.0 (Server and Enterprise Server) and Microsoft Windows 2000 (Server, Advanced Server and Datacenter Editions). Each host 102 is typically connected to the fabric 100 via one or more Host Bus Adapters (HBAs). SAN fabric 100 may enable server-to-storage device connectivity through Fibre Channel switching technology. SAN fabric 100 hardware may include one or more switches 108, bridges 110, hubs 112, or other devices 114 such as routers, as well as the interconnecting cables (for Fibre Channel SANs, fibre optic cables).

Embodiments may simplify and centralize the management of heterogeneous SANs to enable control of SAN resources including, but not limited to, logical volumes, fibre channel adapters, and switches 108, as well as storage devices 104. A logical volume is a virtual disk made up of logical disks. A logical disk (also referred to as a logical device) is a set of consecutively addressed FBA (Fixed Block Architecture) disk blocks that is part of a single virtual disk-to-physical disk mapping. Logical disks are normally not visible to the host environment, except during array configuration operations. A virtual disk is a set of disk blocks presented to an operating environment as a range of consecutively numbered logical blocks with disk-like storage and I/O semantics. The virtual disk is the disk array object that most closely resembles a physical disk from the operating environment's viewpoint.

Embodiments may provide centralized management of SAN-connected devices with automatic discovery, visualization, access control, and policy-based monitoring, alerting and reporting. Embodiments may provide a single point of management from logical unit to interconnect to SAN-connected hosts 102. A LUN (logical unit number) is the SCSI (Small Computer System Interface) identifier of a logical unit within a target, the system component that receives a SCSI I/O command. A logical unit is an entity within a SCSI target that executes I/O commands. SCSI I/O commands are sent to a target and executed by a logical unit within that target. A SCSI physical disk typically has a single logical unit. Tape drives and array controllers may incorporate multiple logical units to which I/O commands can be addressed. Each logical unit exported by an array controller may correspond to a virtual disk. An interconnect is a physical facility by which system elements are connected together and through which they can communicate with each other (e.g. I/O buses and networks).

Embodiments may provide data-centric management from host applications through interconnects to the storage resources, regardless of the underlying hardware and operating system(s). SAN management may occur at physical and logical levels to maintain control regardless of the underlying device environment. With the discovery of host attributes like OS platform, OS handles and IP address, the critical link associating logical devices to a host 102 and its applications may be made.

One embodiment may include a SAN management server 200 and one or more SAN managers 202. SAN management server 200 may discover SAN objects and their attributes, and may provide event management, policy management, and/or notification services. SAN management server 200 may explore the SAN to make data available to client applications, including SAN manager 202. SAN management server 200 may run in a variety of operating systems including, but not limited to: Solaris 2.6, 7, 8, 9, etc.; Linux; AIX; HP-UX 11.0b, 11i, etc.; Microsoft Windows NT 4.0 (Server and Enterprise Server) and Microsoft Windows 2000 (Server, Advanced Server and Datacenter Editions). One embodiment may include an integrated volume manager that may provide capabilities including, but not limited to, pooling storage across multiple heterogeneous arrays on the SAN. The SAN management system may automatically discover and display volumes within its interface. Additionally, adding storage to a host may be streamlined through the SAN management system. In one embodiment, when zoning storage to a host, an operating system rescan may be automatically initiated so that the new device is immediately available for use by the volume manager on the host.

Embodiments may reduce or eliminate the manual task of tracking devices and their connections in the SAN by automatically discovering the physical and logical connections of the SAN, displaying the information in a graphical topology map and logging the data in a variety of inventory reports. One embodiment may enable the automatic discovery of SAN resources using one or more in-band and/or out-of-band protocols and industry standards (e.g. MS/CT, GS-3, SNMP, Fibre Alliance MIB, ANSI T11, SCSI, CIM (Common Information Model), vendor-specific extensions, etc.). Using both in-band and out-of-band protocols, and leveraging industry standards, the SAN management system may automatically capture and display details, including, but not limited to, device driver version, firmware level, status, performance, free and in-use port count, hardware manufacturer, model number and worldwide name (WWN). In-band refers to transmission of a protocol other than the primary data protocol over the same medium (e.g. Fibre Channel) as the primary data protocol. Out-of-band refers to transmission of management information for Fibre Channel components outside of the Fibre Channel network, typically over Ethernet. In one embodiment, a storage administrator may assign customized attributes to devices in the SAN for use in tracking information such as physical location, account code, installation date and asset tag number.

SAN manager 202 may provide a central management interface for various SAN management tasks, and may provide a graphical user interface for displaying the information (e.g. XML data) compiled by and received from SAN management server 200 in graphical and/or textual format, and may provide a user interface for accessing various features of the SAN management system such as tools and utilities. SAN manager 202 may run on any of a variety of end-user platforms coupled to one or more of the hosts 102, typically via a LAN or WAN, or alternatively may run on one of the hosts 102, including the host 102 that includes SAN management server 200. One embodiment may provide in-context launch support for element managers supplied by device vendors to provide vendor-specific management. In one embodiment, to directly manage a device, the administrator may telnet to the device via the SAN manager.

Embodiments may provide customizable, intuitive views into a SAN based on host 102, device, fabric 100, or storage groups, as well as real-time alerts to diagnose and avoid outages. In one embodiment, SAN manager 202 may serve as a centralized point from which a user may view information about a SAN, including, but not limited to, information about the SAN's topology and heterogeneous components. In one embodiment, SAN manager 202 may provide a graphical user interface (GUI) to display information from the SAN access layer and other SAN management server components.

In one embodiment, SAN manager 202 may provide a GUI for facilitating management by allowing the user to graphically drill down into the logical and physical devices on the SAN. One embodiment may provide the ability to zoom in or out on areas of interest in a SAN topology map to simplify the navigation of a growing enterprise SAN. Within the topology map, integrated tool tips may be provided to help identify devices and paths (routes) in the SAN without having to navigate through a complex topology. Information on SAN devices, such as hosts 102 with Host Bus Adapters (HBAs), interconnects, and storage devices 104, may be displayed in context in the GUI, revealing resources in zones as they are physically and logically connected. One embodiment may include a search mechanism. For example, if the administrator wants to ensure that all interconnects in the SAN are at the same firmware level, the administrator may query an integrated search tool for firmware levels to automatically locate all the devices that match the search criteria for the specific firmware level.

One embodiment may provide a real-time alert viewer that may monitor heterogeneous device status, and may provide proactive management capabilities in the SAN environment. Through policies, the status and performance of the device(s) may be monitored, and alerts may be generated when behavior falls outside acceptable boundaries. Embodiments may enable intelligent monitoring through user-definable threshold levels and may perform actions automatically as well as notify administrators of critical events in real time.

Embodiments may provide both real-time and historical performance data for critical service-level parameters such as connectivity, available space and throughput. One embodiment may enable real-time performance charting of SAN devices. Embodiments may monitor interconnect and storage devices in real time, and may be used to display information about the various SAN devices such as current load/status. Through real-time performance monitoring, with flexible user-defined thresholds, one embodiment may notify administrators about issues that could affect overall SAN performance before the issues have an impact. Logging this data for reporting may, for example, extend the administrator's capability to audit and validate service-level agreements.

One embodiment may include a SAN reporter that enables the user to generate and view reports on details of the SAN. In one embodiment, the SAN manager may serve as a centralized point from which reports may be generated and viewed. Embodiments may provide both real-time and historical performance data for critical service-level parameters such as connectivity, available space and throughput. In one embodiment, the SAN management server may collect SAN data that may be provided as real-time and/or historical performance data to the SAN reporter for use in generating SAN performance reports. One embodiment may include "out-of-the-box" or predefined reports that allow users to inventory and analyze their SANs. Embodiments may provide detailed capacity reports to aid in growth planning and may gather detailed information for use in chargeback reports. One embodiment may track LUN allocation to hosts as well as to storage groups, distilling real-time and historical reports that show where storage resources are being consumed.

FIG. 8 illustrates the architecture of the SAN management system according to one embodiment. This embodiment may be based on a distributed client-server architecture, and may be divided into components that may include a SAN manager 202, a SAN management server 200, and a SAN access layer 204. The functions of SAN management server 200 may include one or more of, but are not limited to: automatically discovering SAN-attached objects including hosts, HBAs, switches and storage devices; maintaining a data store of real-time object information; managing SAN resources through zoning and LUN access control; monitoring conditions on the SAN; performing policy-based actions in response to SAN conditions; generating inventory and performance reports; and supporting user-defined grouping of objects based on quality of service (QoS) criteria.

By discovering objects and the relationship of these objects to each other, SAN access layer 204 may maintain a real-time topology of the SAN. SAN access layer 204 may also directly interface with switches on one or more fabrics to manage the zoning of storage resources. SAN access layer 204 may discover additional information about objects on the SAN that SAN management server 200 cannot discover directly, such as devices in a separate zone or on a separate fabric 100.

SAN manager 202 may be a central point for the user to perform one or more SAN management tasks including, but not limited to: administering the SAN; viewing topographical displays of discovered objects on the SAN; accessing detailed information on components including object attributes and connectivity; creating and modifying policies; administering access control through zoning and LUN security; monitoring SAN events including real-time alerts; allocating storage resources; generating and viewing inventory and performance reports; generating and viewing real-time and historical reports; and/or launching utilities, tools and applications, which may include third-party management tools. In one embodiment, other applications, such as a Web browser, may function as clients to SAN management server 200. In one embodiment, multiple SAN managers 202 may connect simultaneously with SAN management server 200. One embodiment may include a command line interface that enables the user to query and modify SAN management server alarm service objects and configuration settings, and to perform other related SAN management system tasks.

FIG. 9 illustrates the architecture of SAN access layer 204 according to one embodiment. In one embodiment, SAN access layer 204 may include an engine 250 that may perform one or more functions which may include, but are not limited to, coordinating the activity of explorers 206, managing changes to data store 254, and performing zoning operations by communicating with switches on fabric 100. In one embodiment, SAN access layer 204 may include one or more explorers that provide an interface to different types of heterogeneous SAN components so that the SAN management system may provide a common data representation for heterogeneous SAN components. Explorers 206 may communicate with the SAN components over Fibre Channel (in-band) and/or Ethernet (out-of-band) connections to inventory the SAN. Each explorer may communicate with a specific type of device using a protocol available for that specific type of device.

Once the SAN is discovered, SAN access layer 204 may continue to monitor the SAN and may update data store 254 as new events occur on the SAN. In one embodiment, SAN access layer 204 may periodically examine the SAN, for example to discover or determine objects that are added, objects that are removed, and connections that are pulled. In one embodiment, data gathered by the explorers may be aggregated into data store 254, which may be updated with real-time information about objects on the SAN. In one embodiment, SAN access layer engine 250 may manage data store 254. In one embodiment, data store 254 may be an embedded, ODBC-compliant, relational database. In one embodiment, data from the database may be imported into a data warehouse to track changes and analyze the SAN over time.
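
As a minimal illustration of the periodic re-examination described above, the following Python sketch (names and structure are assumptions, not the actual implementation) compares two discovery snapshots to find objects that were added or removed:

    def diff_san(previous, current):
        """Compare two sets of discovered object identifiers."""
        added = current - previous        # new objects on the SAN
        removed = previous - current      # objects removed or connections pulled
        return added, removed

    # Hypothetical example: a switch port disappears between two scans.
    old_scan = {"switch1", "switch1:port7", "array1"}
    new_scan = {"switch1", "array1", "host3"}
    print(diff_san(old_scan, new_scan))   # ({'host3'}, {'switch1:port7'})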

In one embodiment, SAN access layer 204 may include an agent 252 that translates information from data store 254 into formatted files (e.g. XML files), which may be provided to client applications such as SAN manager 202 or Web browsers. Agent 252 may also enforce user authentication for commands sent to SAN management server 200, and may handle communication between SAN management server 200 and any hosts running a SAN access layer remote (described below).

In one embodiment, SAN manager 202 is a client of SAN access layer 204, and may graphically and/or textually display objects discovered by SAN access layer 204. In one embodiment, SAN manager 202 may open a connection (e.g. TCP/IP socket) with SAN access layer agent 252 and send a message (e.g. an XML message) requesting data stored in data store 254. Upon receiving the request, SAN access layer engine 250 may dynamically create a document (e.g. an XML document) describing the SAN topology. SAN access layer agent 252 then may send this document to SAN manager 202. Once SAN manager 202 successfully receives the message, SAN access layer agent 252 may close the connection. When SAN manager 202 receives the document, it may read the file and display, in graphical and/or textual format, the information the document provides about the SAN.
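
The request/response exchange described above might look like the following Python sketch; the host name, port number, and message schema are hypothetical, since the document does not specify them:

    import socket

    def fetch_topology(agent_host="san-mgmt-server", agent_port=5000):
        # Open a TCP/IP socket connection to the SAN access layer agent.
        with socket.create_connection((agent_host, agent_port)) as sock:
            # Send an XML message requesting data stored in the data store.
            sock.sendall(b"<request type='topology'/>")
            # Read the dynamically generated XML topology document until
            # the agent closes the connection.
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks).decode()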

In one embodiment, the data generated by SAN access layer 204 may be in a format (e.g. XML) that may be read by a Web browser or exported to a file that may be opened and edited using a standard text editor. In one embodiment, a SAN's current state may be captured in a file, e.g. an XML or other markup language file. Thus, snapshots of the SAN may be saved over time, which may be analyzed and compared to current conditions on the "live" SAN.

In one embodiment, SAN access layer 204 may be configured for discovery and device communication through a configuration file. The configuration file may include one or more parameters for the SAN access layer and/or globally for the explorers. Each type of explorer may have a section in the configuration file that may include one or more parameters specific to the particular type of explorer.
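
A configuration file of this shape might look like the following sketch; the file format, section names, and parameter names are illustrative assumptions only, not the actual file syntax:

    ; global parameters for the SAN access layer
    [sanaccesslayer]
    discovery_interval = 300            ; seconds between SAN rescans

    ; parameters specific to the out-of-band switch explorer
    [explorer:outofband_switch]
    enabled = true
    addresses = 10.0.0.11, 10.0.0.12    ; one IP per switch (or per proxy)

    ; parameters specific to the HBA explorer
    [explorer:hba]
    enabled = true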

FIG. 10 illustrates an exemplary SAN and further illustrates the architecture and operation of the SAN management system according to one embodiment. This embodiment may be based on a distributed client-server architecture, and may be divided into components which may include a SAN manager 202, a SAN management server 200, a SAN access layer 204 and a database 226. In this embodiment, SAN access layer 204 may be a component or "layer" of SAN management server 200. SAN management server 200 may also include a policy service 220 and an alarm service 222.

In one embodiment, one or more explorers 206D may be included within SAN access layer 204. In one embodiment, SAN access layer 204 may aggregate information gathered by explorers 206D into a SAN access layer 204 data store. Once the SAN is discovered, SAN access layer 204 may periodically examine the SAN for objects that are added, objects that are removed, and connections that are pulled. In one embodiment, new explorers 206 may be added as needed or desired. For example, if a new type of SAN device is added to the SAN, or an existing type of SAN device is modified or upgraded, an explorer 206 may be added or updated to correctly communicate with the new or updated type of SAN device.

Explorers 206 may use different methods to discover information about heterogeneous SAN objects. In one embodiment, explorers 206 may query objects on the SAN to retrieve a standard set of attributes for each type of object. The terms "information" and "details" may be used to describe the different kinds of data about a SAN that may be discovered, including, but not limited to, SAN events, zone memberships, connectivity, etc. The term "attributes" refers to a subset of that larger body of information. Attributes are details that are particular to a type of object; for a switch, for example, attributes include its vendor, model number, firmware version, port count, World Wide Name (WWN), and out-of-band address.

Explorers 206 may be categorized into types including, but not limited to, switch explorers, zoning explorers, disk array explorers, and Host Bus Adapter (HBA) explorers. Switch explorers may discover switch information such as vendor name, firmware version, and model name. Switch explorers may include, but are not limited to, a management server explorer and an out-of-band switch explorer. A management server explorer may communicate with supported switches over Fibre Channel connections. In one embodiment, the management server explorer may use the Fibre Channel Common Transport (CT) protocol to communicate with switches in fabric 100. The management server explorer may, for example, discover switches in-band over Fibre Channel, obtain switch characteristics, and/or explore port connectivity. In one embodiment, the management server explorer may optionally run over IP networks. For some switches, the management server explorer may run out-of-band. In one embodiment, the management server explorer may perform in-band zoning.

One embodiment may include an out-of-band switch explorer to communicate with switches (or their proxies) over Ethernet. In one embodiment, the out-of-band switch explorer may discover devices managed over any IP network. In one embodiment, the out-of-band switch explorer may use SNMP (Simple Network Management Protocol). SNMP is a protocol for monitoring and managing systems and devices in a network. The data being monitored and managed is defined by a MIB (Management Information Base), the specification and formal description of a set of objects and variables that can be read and possibly written using the SNMP protocol. Some embodiments may use other network protocols, for example Common Management Information Protocol (CMIP), Remote Monitoring (RMON), etc. Enabling the out-of-band switch explorer may include specifying IP addresses for each switch (or for multiple switch fabrics, each proxy) in a SAN access layer configuration file.
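
As an illustration of out-of-band SNMP polling, the following Python sketch issues a single SNMP GET for a switch's sysName using the third-party pysnmp library (4.x-style hlapi); the OID and community string are conventional defaults, and the document does not prescribe any particular library:

    from pysnmp.hlapi import (
        SnmpEngine, CommunityData, UdpTransportTarget, ContextData,
        ObjectType, ObjectIdentity, getCmd,
    )

    def poll_switch(ip, oid="1.3.6.1.2.1.1.5.0", community="public"):
        # Issue one SNMP GET against the switch (or its proxy).
        error_indication, error_status, _, var_binds = next(getCmd(
            SnmpEngine(),
            CommunityData(community),
            UdpTransportTarget((ip, 161)),
            ContextData(),
            ObjectType(ObjectIdentity(oid)),   # sysName.0 from the MIB
        ))
        if error_indication or error_status:
            raise RuntimeError(str(error_indication or error_status))
        return {str(name): str(value) for name, value in var_binds}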

Zoning explorers may be used as an interface for SAN access layer 204 to communicate with fabric switches to perform discovery and control of zones in the SAN. When users issue zoning commands, SAN access layer 204 may use a zoning explorer to contact the switch to perform the zoning operation. In one embodiment, zoning explorers may communicate with the switches out-of-band. Embodiments may provide zoning explorers specific to fabric switches provided by various switch vendors. In one embodiment, one or more zoning explorers may complete transactions with a switch management server (name server) to discover zone names and attributes and to perform switch zoning commands.

HBA explorers may discover information about SAN-connected storage devices 104 that are zoned to a host 102 that is running a SAN management server 200 or where a SAN access layer remote 230 (described below) resides. The HBA explorer may interact with a host 102 to discover HBAs and device paths. A device path may be defined as a route through an interconnect that allows two or more devices to communicate. In one embodiment, an HBA explorer may not discover locally attached storage (e.g. disks or other devices attached through a SCSI or IDE controller). If these storage devices have OS handles, then the HBA explorer may return LUN names and attributes. An OS handle may be used by the operating system to identify a storage resource (known as an Addressable Unit, or AU), and the correct methods (e.g. driver/system call) to access the storage resource. If no OS handles are available, then the HBA explorer may identify the device as a generic device (a block device attached to a port on the host).

Disk array explorers may provide information about array names and their attributes, such as number of ports and the number of disks contained in an array. Disk array explorers may discover disk arrays/enclosures and their LUNs. Disk array explorers may pass LUN management commands to the array's management interface (e.g. CCS or SYMCLI) to execute. In one embodiment, disk array explorers may discover LUNs that are not masked to discovered hosts. SAN access layer 204 may include disk array explorers specific to disk arrays of various vendors. In one embodiment, disk array explorers may start when SAN access layer 204 starts. In one embodiment, the disk array explorers may check to see if host 102 has a management interface. If host 102 does not have the management interface, the corresponding explorer may be disabled. If the management interfaces are present, the explorers may determine if the host has access to any LUNs exported by the array. If any LUNs are available, the explorers may attempt to discover the array using the OS handle of the LUN. In one embodiment, some disk array explorers may use an out-of-band network protocol such as SNMP to communicate directly with the disk array controller. IP addresses for each disk array may be supplied for SAN access layer 204 discovery and communication. In one embodiment, SAN access layer 204 may communicate with a disk array through the array's management interface. In one embodiment, the array vendor's management software is installed on a host 102 with an in-band connection to the arrays to be managed. The management software may provide a unified interface/command interpreter between the SAN management system and the arrays on the fabric. In one embodiment, a SAN management server 200 or a SAN access layer remote 230 is installed on the host 102 that is running the management software in order to communicate with the arrays.

In one embodiment, SAN access layer 204 may automatically discover information for each Addressable Unit (LUN) that is under the control of a volume manager. In one embodiment, SAN management server 200 may discover information about HBAs on other hosts 102 attached to fabrics 100 discovered by SAN management server host 102A.

One embodiment may include a SAN access layer remote 230 that may be installed on one or more other hosts 102 in the SAN, if any, to assist SAN management server 200 in discovering the entire SAN. In one embodiment, SAN access layer remote 230 may be installed on every host 102 on the SAN (excepting the host including the SAN access layer 204) to provide complete and accurate discovery. In one embodiment, each installation of SAN access layer remote 230 may include one or more explorers 206E. In one embodiment, explorers 206E may include one or more explorers 206 that may also be used by SAN access layer 204, such as a management server explorer and an HBA explorer. In one embodiment, explorers 206E may also include an out-of-band switch explorer. In one embodiment, SAN access layer 204 and each installation of SAN access layer remote 230 may each include a set of one or more explorers 206 that may be determined by the discovery requirements and/or contents of the region of the SAN visible to the host 102 on which SAN access layer 204 or the installation of SAN access layer remote 230 resides. Each installation of SAN access layer remote 230 may provide information gathered by explorers 206E to SAN access layer 204, which may aggregate this information into the SAN access layer 204 data store. In one embodiment, SAN management server 200 communicates with SAN access layer remote(s) 230 across an HTTP connection. In one embodiment, SAN management server 200 may use XML to communicate with SAN access layer remote(s) 230. Other embodiments may use other connections and other communications protocols.

In one embodiment, to get detailed information about a remote host 102, SAN access layer remote 230 may be installed on the host 102, and the host 102 may be added to a SAN access layer configuration file on SAN management server 200. In one embodiment, a host 102 running SAN access layer remote 230 may be specified as either a "Host" or an "In-Band Host" in the SAN access layer configuration file. The "Host" entry may be used to define other hosts 102 attached to the SAN. The "In-Band Host" entry may be used to define at least one SAN access layer remote host 102 for each fabric 100 that is not attached to, and thus not discovered by, SAN management server 200. In one embodiment, if SAN access layer remote 230 is not installed on a host 102, SAN management server 200 may still discover the HBA, and the enclosure utility may be used to accurately visualize the host in SAN manager 202's user interface.

In one embodiment, policy-based management may enable the monitoring of conditions on a SAN and may facilitate quick response when problems occur. Conditions that may be monitored may fall into one or more categories of interest to storage administrators. Embodiments may use one or more methods for monitoring conditions on a SAN. These methods may include, but are not limited to, out-of-band polling (e.g. SNMP polling), traps (e.g. SNMP traps) and SAN access layer 204. SAN access layer 204 may provide notification of SAN events such as the addition or deletion of SAN components such as SAN fabrics, switches and arrays. One embodiment may monitor conditions in-band, e.g. using the Fibre Channel Common Transport (CT) protocol.

Among other SAN monitoring methods, SAN management server 200 may receive SNMP traps from elements on the SAN. To monitor conditions on a SAN using SNMP traps, some SAN objects may send SNMP traps to SAN management server 200 when an event happens. SNMP-capable devices on the SAN may be configured to send traps to the host 102A running SAN management server 200. In one embodiment, these traps are asynchronous, so the SAN management system cannot poll such an object to determine the current condition. This embodiment may be dependent on the trap sender to report when a condition changes by sending additional traps. In another embodiment, objects may be polled directly to determine the current condition. In one embodiment, to monitor an object on a SAN, the object may include an SNMP agent that is configured to accept SNMP polls and to send SNMP traps.

One embodiment may include collectors. A collector may be a path or channel through which a specific type of data is gathered for a specific object type. Collectors may include one or more of, but are not limited to, collectors for object availability, environmental conditions, device errors, and SAN traffic. Collectors may monitor properties such as switch port status, dropped frames, disk temperature, link failures and so on, which may be evaluated by policy service 220 to create an accurate composite status of the SAN. In one embodiment, the status of devices may be displayed on a topology map of a SAN manager 202 user interface, for example using color-coded icons. In one embodiment, these collectors may be based on devices' SNMP MIB variables. One embodiment may include one collector per data type per object, for each object that can be monitored. In one embodiment, each collector may be associated with an object type, such as a SAN host 102 or a switch port. In one embodiment, each collector may be associated with a type of data, for example textual state or numeric threshold data. Collector data may be used in real-time collector graphs, the policy engine, and the SAN reporter, for example.
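
The collector abstraction might be modeled as in the following Python sketch; the field names and example values are assumptions for illustration:

    from dataclasses import dataclass
    from typing import Any, Callable

    @dataclass
    class Collector:
        object_id: str               # e.g. a switch port identifier
        object_type: str             # e.g. "switch_port", "host"
        data_type: str               # "numeric" (threshold) or "textual" (state)
        read: Callable[[], Any]      # gathers one sample, e.g. an SNMP poll

    # Hypothetical collector: traffic counter on a switch port.
    port_traffic = Collector(
        object_id="switch1:port7",
        object_type="switch_port",
        data_type="numeric",
        read=lambda: 0.42,           # stand-in for a real SNMP MIB read
    )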

One embodiment may include a policy service 220 that manages policies associated with objects on the SAN. Policies may be rules used to help manage a SAN by automating responses to certain events and conditions. Policies may detect when something goes wrong, and may be used to anticipate and handle problems before they occur. A policy may indicate a particular object or type of object to monitor. In general, any object for which at least one collector is provided may be monitored. Objects that may be monitored include, but are not limited to, fabrics 100, switches, switch ports, hosts 102, and disk arrays. One embodiment may include a set of policies that monitor SAN management server 200. A policy may include a description of a condition to monitor on an object, such as a high percentage of bandwidth utilization on a switch port, and a set of actions to take when that condition is met. A policy may indicate one or more actions to be taken when the condition is detected. In one embodiment, policy service 220 may be integrated with SAN manager 202, permitting users to view what policies are in effect on their SAN, to define and modify policies, and to generate inventory and performance reports based on the conditions monitored by policy service 220. In one embodiment, SAN manager 202 may include a policy utility to facilitate policy creation and maintenance. The policy utility may lead a user through the steps of providing the information described above to create user-defined policies. The user may use the policy utility to make changes in predefined or user-defined policies as desired.
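
A policy, as described above, pairs a monitored object and a condition with a set of actions. A minimal Python sketch, with hypothetical names, might be:

    from dataclasses import dataclass, field
    from typing import Any, Callable, List

    @dataclass
    class Policy:
        object_id: str                          # object to monitor
        condition: Callable[[Any], bool]        # e.g. a numeric threshold test
        actions: List[Callable[[str], None]] = field(default_factory=list)

    # Hypothetical policy: alert on high bandwidth utilization on a switch port.
    high_util = Policy(
        object_id="switch1:port7",
        condition=lambda value: value > 0.80,
        actions=[lambda obj: print(f"ALERT: high utilization on {obj}")],
    )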

One embodiment may include a policy engine that performs the instructions described in all policies enabled on the SAN. In one embodiment, the policy engine may be a component or process of policy service 220. When the objects on the SAN are discovered, collectors corresponding to the objects may be determined and the relevant collectors may be registered with the policy engine. The policy engine then may receive a stream or streams of real-time collector data and compare data values with the conditions described in its policies. When the alarm condition for a particular policy is met, the policy engine performs the actions described in the policy.
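
Continuing the Policy sketch above, the core engine loop, receiving a stream of collector samples and comparing them against policy conditions, might look like this (a sketch under the same assumptions, not the actual implementation):

    def run_policy_engine(policies, collector_stream):
        # Index the registered policies by the object they monitor.
        by_object = {}
        for p in policies:
            by_object.setdefault(p.object_id, []).append(p)
        # collector_stream yields (object_id, value) samples in real time.
        for object_id, value in collector_stream:
            for policy in by_object.get(object_id, []):
                if policy.condition(value):        # alarm condition met
                    for action in policy.actions:
                        action(object_id)

    # Example: the second sample trips the high-utilization policy above.
    run_policy_engine([high_util],
                      iter([("switch1:port7", 0.55), ("switch1:port7", 0.91)]))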

An alarm is a signal that is generated by a policy when the condition specified in the policy is detected or evaluated as true. An alarm may be triggered if the condition and alarm action are configured in the policy. An alarm is an internal signal used by the SAN management system. An alert to SAN manager 202 is a configurable response that may result from an alarm being triggered. When an alarm is triggered, the alarm may be referred to as active. In one embodiment, alarms may be dynamic; an alarm resets itself automatically when the condition monitored by the policy returns to a specified "clear state." The clear state for a condition may be specified either manually or automatically, depending on whether the condition is a threshold or a textual comparison condition. One embodiment may include an alarm service 222 that may monitor and collect status and performance information from the SAN using both out-of-band (e.g., SNMP) and SAN access layer 204 events. This collector information may be fed into policy service 220 to trigger policy actions and may be logged for reporting purposes. In one embodiment, data collected by the alarm service may be logged in database 226.

The conditions available for a policy may be determined by the type of object being monitored. Different types of policy conditions may result in different types of alarms. There may be different types of conditions for various objects managed by SAN management server 200. One type of policy is a threshold condition with action policy, which may be used to monitor an object and detect when a particular numeric threshold is reached and sustained for a configurable period. Another type of policy is a text comparison condition with action policy, which may be used to evaluate a textual state to determine the status or condition of the resource.
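
A threshold that must be reached and sustained for a configurable period could be modeled as a stateful condition, as in this sketch (class and parameter names are assumptions):

    import time

    class SustainedThreshold:
        # True only when samples stay above `threshold` for `hold_seconds`.
        def __init__(self, threshold, hold_seconds):
            self.threshold = threshold
            self.hold_seconds = hold_seconds
            self.breach_started = None

        def __call__(self, value, now=None):
            now = time.monotonic() if now is None else now
            if value <= self.threshold:
                self.breach_started = None       # any dip resets the timer
                return False
            if self.breach_started is None:
                self.breach_started = now        # breach begins
            return (now - self.breach_started) >= self.hold_seconds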

For every policy, one or more actions to be taken when the specified condition is detected may be configured. Actions may, for example, perform corrective and/or notification functions. One type of policy action is a console alert, which may send an alert to SAN manager 202 when the specified condition is detected. The desired level of severity associated with the action may be configurable. Another type of policy action is a command or script (e.g., a PERL script) that executes a command or executable file specified for the action. Yet another type of policy action is to send e-mail notification to one or more specified recipients. In one embodiment, policy service 220 may be configured to send traps (e.g. SNMP traps) as notifications to applications. In one embodiment, policy action options may also include paging and Instant Messaging.

In one embodiment, specific hardware alerts may be forwarded to generate alerts on the applications that will be affected by the hardware problems. In one embodiment, application alerts and/or hardware alerts may be forwarded to create alerts for specific departments. This may provide a top-down alert hierarchy.

In one embodiment, SAN manager 202 may serve as a centralized point from which a SAN administrator or other user may create and manage groups of SAN objects, including groups of heterogeneous components. One embodiment may provide a group utility for creating and managing logical groups of SAN objects including hosts 102, storage device 104 interconnects, other groups, and other objects that may be members of a group. A group may be defined as an arbitrary set of SAN elements defined by an administrator to help organize and provision resources, and may be implemented by storage administrators to identify and manually provision available storage devices 104 that match the quality of service requirements of particular user groups or applications. The group utility may be used to create logical storage groups where device membership may be based on zoning, LUN masking, hosts etc., and may also be based on the need for a collection of devices to be viewed as one entity for activities such as reporting, configuring and monitoring SAN resources.

One embodiment may support one or more types of groups, including, but not limited to, generic groups, storage accounts, and storage groups. In one embodiment, groups may be nested within other groups. Generic groups may include switches, hosts 102, storage devices 104, and/or nested groups of any group type. Storage accounts may include hosts 102, storage devices 104, and/or nested groups (storage accounts or storage groups only). A storage account may include one or more host objects and all the storage that the administrator assigns to them. Storage groups may include storage devices 104 and/or nested groups (storage groups only). Storage groups may be used to categorize storage resources by quality of service criteria including, but not limited to, cost, performance, capacity and location.

The flexible connectivity capabilities of the SAN storage model may pose security risks. Zoning helps alleviate that risk by providing a method of controlling access between objects on the SAN. By creating and managing zones, the user may control host 102 access to storage resources. In one embodiment, the SAN manager may serve as a centralized point from which an administrator or other user may create and manage zones of SAN objects, including zones of heterogeneous components. A zone is a set of objects within a SAN fabric that can access one another. Zones and their member objects may be defined in zoning tables within the switches on the SAN fabric 100. When zoning is implemented on a SAN fabric 100, the switches consult the zoning table to determine whether one object is permitted to communicate with another object, and restrict access between them unless they share a common membership in at least one zone. Fabric zoning occurs at the level of individual nodes or ports attached to the SAN fabric 100. Zoning-enabled fabrics 100 may include zoning tables that define each zone along with its member objects. These zones function similarly to virtual private networks (VPNs) on traditional networks.
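
The zoning-table check the switches perform can be expressed compactly. In this Python sketch (a toy model with invented WWNs, not a switch implementation), two objects may communicate only if they share membership in at least one zone:

    def may_communicate(zoning_table, wwn_a, wwn_b):
        # zoning_table maps zone name -> set of member WWNs.
        return any(wwn_a in members and wwn_b in members
                   for members in zoning_table.values())

    zones = {
        "unix_storage": {"wwn:host1", "wwn:array1:port0"},
        "backup":       {"wwn:host2", "wwn:tape1"},
    }
    print(may_communicate(zones, "wwn:host1", "wwn:array1:port0"))  # True
    print(may_communicate(zones, "wwn:host1", "wwn:tape1"))         # False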

There may be one or more ways to use zoning to improve the security and organization of the SAN. Examples of uses of zoning include, but are not limited to: isolating storage resources for different operating environments, such as separating UNIX storage from Windows NT storage; setting aside resources for routine backups; securing areas of the SAN for storage of sensitive data; and creating dedicated resources for closed user groups.

In one embodiment, the SAN management system may provide methods to enforce the access restrictions created by zones on the SAN. These may include two methods corresponding to the forms of zoning commonly referred to as soft zoning and hard zoning.

Soft zoning, also called advisory zoning, may be enforced simply by filtering the visibility of objects on the SAN so that an object can only see other objects that share at least one zone membership with the object. At boot time, a SAN host 102 or device requests a list of the World Wide Names (WWNs) on the SAN fabric 100 from the fabric Name Service. The Name Service may consult the zoning table and filter out of its response any WWNs that are not zoned together with the host 102 or device making the request. In this way, a host 102 on the SAN is only made aware of devices whose WWNs are zoned together with the host's HBA port. Soft zoning is flexible because it does not rely on an object's physical location on the SAN. If its physical connection to the SAN fabric 100 changes, its zone memberships remain intact because the zone memberships are based on the WWNs of the object's ports. However, soft zoning may have a security vulnerability in that it does not actively prevent access between objects that belong to different zones. Even if the Name Service does not supply a SAN host 102 with the WWN of a device that is zoned away from the host 102, a user who knows that WWN (or a hacker trying different combinations of addresses) may still send Fibre Channel packets from the host 102 to that device.
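
The Name Service filtering that implements soft zoning might be sketched as follows (a toy Python model; real fabrics answer Fibre Channel name server queries, not dictionary lookups):

    def name_service_response(zoning_table, requester_wwn, all_wwns):
        # Collect every WWN that shares at least one zone with the requester.
        visible = set()
        for members in zoning_table.values():
            if requester_wwn in members:
                visible |= members
        # Filter the response; WWNs zoned away from the requester are omitted,
        # but note that nothing here actively blocks traffic to them.
        return [w for w in all_wwns if w in visible and w != requester_wwn]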

When hard zoning is implemented, a Fibre Channel switch may actively block access to zone members from any objects outside the zone. This may be performed at the level of ports on the switch. Hard zoning may also be referred to as switch port zoning. The switch checks each incoming Fibre Channel packet against its routing table to see whether the packet may be forwarded from the entry port to its destination port. Switch port zoning offers strong security because it actively segregates zone members from the rest of the SAN fabric 100. However, hard zoning may lack the flexibility of soft zoning, since an object attached to a zoned switch port loses its zone membership when it is physically disconnected from that switch port and moved elsewhere on the SAN. New objects attached to the switch port may inherit the zone memberships of that port, so planning and record keeping by the administrator may be needed to avoid breaks in security when moving objects around on the SAN.

In one embodiment, the SAN management system may support the zoning of objects on the SAN including, but not limited to, switch ports, hosts 102, and storage devices 104 including, but not limited to, storage arrays, JBODs, and individual storage devices. In one embodiment, the SAN management system may support switch zoning through application program interfaces (APIs) provided by switch vendors, allowing for both hard (port-level) and soft (advisory, WWN) zoning. Zoning may be implemented and used by storage administrators using one or more SAN management system services, tools and/or utilities for allocating storage resources and managing SAN security, and optionally one or more third-party tools, utilities or applications. In one embodiment, the SAN manager may serve as a centralized point from which a manager or other user may access SAN management system and/or third-party services, tools, applications, and/or utilities to create and manage zones on the SAN, including zones containing heterogeneous SAN objects.

In one embodiment, the SAN management system may provide a zone utility that may facilitate the creation, modification, and deletion of zones. In one embodiment, the zone utility may be provided through the SAN manager. The zone utility may provide storage zone definition, creation and management. The zone utility may be used to administer zones directly and visually, and may reduce or remove the need to use telnet commands or proprietary, hardware-specific Web-based solutions. The zone utility may facilitate the creation of new zones and edits to existing zones. The zone utility may automatically filter the list of objects on the SAN and present a list of objects that are available to be added to a zone. In one embodiment, an object may be zoned based on the World Wide Name (WWN) of the object node, the WWN of an individual port under the object node, or the switch port to which the object is attached. In one embodiment, users may administer zoning through the zone utility or optionally through a command line interface.

There may be no industry-wide standard for zoning, and thus different vendors' switches may implement switch zoning in different ways. Thus, one embodiment of the SAN management system may use a switch-neutral approach to zoning. This embodiment may not specify, for example, whether hard zoning (port-level zoning) or soft zoning (based on WWNs) should be applied in any particular case. In this embodiment, implementation details such as these may be left up to the switch vendor. Embodiments may also provide datapath zoning control for interconnects from vendors such as Brocade, QLogic, and McDATA, using the zone utility to abstract the individual interconnects' complex zoning tools to simplify creating, adding to, and deleting zones.

Ensuring that SAN applications have the required storage resources may include providing secure storage from storage devices 104 (e.g. disk arrays, tape backup devices, etc.) to hosts 102 within the SAN. In one embodiment, the SAN management system may integrate storage masking from various array providers, for example Hitachi Data Systems, Compaq and EMC, to hosts 102 in the SAN. LUN (Logical Unit Number) security is the collective name given to the operations involved in making storage device 104 resources available to hosts 102 on a SAN. In one embodiment of the SAN management system, LUN security may provide granular control over host 102 access to individual LUNs within an array or other collection of potentially heterogeneous storage devices. LUN security may include LUN locating or searching, LUN binding, LUN masking, and fabric zoning. In one embodiment, the SAN manager may serve as a centralized point from which the administrator or other user may manage LUN security for heterogeneous SAN components.

A LUN is the SCSI (Small Computer System Interface) identifier of a logical unit within a target, the system component that receives a SCSI I/O command. A logical unit is an entity within a SCSI target that executes I/O commands. SCSI I/O commands are sent to a target and executed by a logical unit within that target. A SCSI physical disk typically has a single logical unit. Tape drives and array controllers may incorporate multiple logical units to which I/O commands can be addressed. Each logical unit exported by an array controller corresponds to a virtual disk.

LUN security may include LUN binding, the creation of access paths between an addressable unit (which may also be referred to as an AddrUnit, an AU, a unit, a volume, a logical unit, a logical disk, or a logical device) within a disk array and a port on the array. FIG. 11 illustrates LUN binding according to one embodiment. In the LUN binding process, an AU 288 is bound to a specified array port 286 (e.g. array port 286A or 286B) in a specified storage device 104 (e.g. a storage system/disk array). This results in the creation of a LUN 282. AUs 288A, 288B, 288C, and 288D are storage volumes built out of one or more physical disks within the storage device 104. Array ports 286A and 286B are connected to the SAN fabric 100 and function as SCSI targets behind which the AUs 288 bound to those ports 286 are visible. "LUN" is the term for the access path itself between an AU 288 and an array port 286, so LUN binding is actually the process of creating LUNs 282. However, a LUN 282 is also frequently identified with the AU 288 behind it and treated as though it had the properties of that AU 288. For the sake of convenience, a LUN 282 may be thought of as being the equivalent of the AU 288 it represents. Note, however, that two different LUNs 282 may represent two different paths to a single volume. A LUN 282 may be bound to one or more array ports 286. A LUN 282 may be bound to multiple array ports 286, for example, for failover, switching from one array port 286 to another array port 286 if a problem occurs.

LUN security may also include LUN masking to enable access to a particular Addressable Unit for a host on the SAN. FIG. 12 illustrates LUN masking according to one embodiment. LUN masking is a security operation that indicates that a particular host 102 (e.g. host 102A or 102B), HBA (Host Bus Adapter) 284 (e.g. HBA 284A or 284B), or HBA port 292 (e.g. HBA port 292A or 292B) is able to communicate with a particular LUN 282. In the LUN masking process, a bound AU 288 (e.g. AU 288A, 288B, 288C or 288D) may be masked to a specified HBA port 292, HBA 284, or host 102 (e.g. all HBAs on the host) through a specified array port 286 in a specified storage device 104. When an array LUN 282 is masked, an entry is added to the Access Control List (ACL) 290 (e.g. ACL 290A, 290B, 290C, 290D, or 290E) for that LUN 282. Each ACL 290 includes the World Wide Name of each HBA port 292 that has permission to use that access path, that is, to access that AU 288 through the particular array port 286 represented by the LUN 282.
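
LUN binding and masking together might be modeled as in the following toy Python sketch; the class and identifier names echo the figure's reference numerals but are otherwise invented:

    class Array:
        def __init__(self):
            self.luns = {}   # (AU, array port) -> ACL: set of HBA port WWNs

        def bind(self, au, array_port):
            # LUN binding: create the access path (the LUN) between an
            # addressable unit and an array port.
            self.luns.setdefault((au, array_port), set())

        def mask(self, au, array_port, hba_port_wwn):
            # LUN masking: add the HBA port's WWN to that LUN's ACL.
            self.luns[(au, array_port)].add(hba_port_wwn)

        def visible_luns(self, hba_port_wwn):
            # Default behavior: no access unless explicitly masked in.
            return [lun for lun, acl in self.luns.items()
                    if hba_port_wwn in acl]

    array = Array()
    array.bind("AU288A", "port286A")
    array.mask("AU288A", "port286A", "wwn:hba292A")
    print(array.visible_luns("wwn:hba292A"))   # [('AU288A', 'port286A')]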

LUN masking may be thought of as the removal of a mask between an AU 288 and a host 102 to allow the host to communicate with the LUN 282. The default behavior of the storage device 104 may be to prohibit all access to LUNs 282 unless a host 102 has explicit permission to view the LUNs 282. The default behavior may depend on the array model and, in some cases, the software used to create the AU 288.

LUN security may also include fabric zoning. FIG. 13 illustrates fabric zoning according to one embodiment. After a LUN is masked to an HBA port 292 (e.g. HBA port 292A, 292B or 292C) in a host, the zoning configuration of the SAN fabric 100 may still prevent the host from accessing the AU behind that LUN. In order for the host to see the AU and create an Operating System (OS) handle for it, there must be at least one zone on the fabric 100 that contains both the HBA port 292 (e.g. HBA port 292A, 292B or 292C) and the array port 286 (e.g. array port 286A or 286B) to which the AU is bound. A zoning operation may be required if the HBA port 292 and array port 286 are not already zoned together. Zoning operations may include creating a new zone 294 and adding the array port 286 and the HBA port 292 to an existing zone 294. Zones 294 may also include one or more ports on one or more fabric devices (e.g. switches 108A and 108B) in the device path between the array port 286 and the HBA port 292. Fabric zoning occurs at the level of individual nodes or ports attached to the SAN fabric. Zones and their member objects may be defined in zoning tables within the switches 108 on the SAN fabric. When zoning is implemented on a SAN fabric, the switches 108 consult the zoning table to determine whether one object is permitted to communicate with another object, and restrict access between them unless they share a common membership in at least one zone.
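
The zoning step of LUN security, ensuring that some zone contains both the HBA port and the array port, might be sketched as follows (zone and port names are invented for illustration):

    def ensure_zoned(zoning_table, hba_port, array_port, new_zone="zone294"):
        # zoning_table maps zone name -> set of member ports/WWNs.
        already_zoned = any(hba_port in members and array_port in members
                            for members in zoning_table.values())
        if not already_zoned:
            # Create a new zone; adding both ports to an existing zone
            # would satisfy the requirement equally well.
            zoning_table.setdefault(new_zone, set()).update(
                {hba_port, array_port})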

In FIG. 13, zone 294A includes HBA port 292A, the array ports 286A and 286B through which HBA port 292A may access LUNs bound to the array ports 286, and the switch ports on switches 108A and 108B through which HBA port 292A and array ports 286 are coupled. Zone 294B includes HBA port 292C, array port 286B through which HBA port 292C may access LUNs bound to the array port 286B, and the switch port(s) on switch 108B through which HBA port 292C and array port 286B are coupled. HBA ports 292A, 292B and 292C may be on the same host or on different hosts and, if on the same host, on the same HBA or on different HBAs. Array ports 286A and 286B may be on the same storage system or on different storage systems. For more information on zoning, see the description of zoning above.

In one embodiment as illustrated in FIG. 14, the SAN management server 200 may discover SAN components including, but not limited to, one or more storage devices 104 (e.g. storage devices 104A and 104B) each including one or more addressable storage units and one or more fabric ports for coupling to the SAN, and one or more host systems 102 each including one or more host bus adapters (HBAs) 284 which each provide host adapter ports for coupling to the SAN. The SAN manager 202 client may access the SAN management server to provide a user interface for selecting addressable storage units to be made available to selected host adapter ports, and to communicate with the SAN management server to create access paths between selected addressable storage units and selected fabric ports of the storage systems, enable access to the selected addressable storage units for the selected host adapter ports, and zone the selected storage system fabric ports in a common fabric 100 zone with the selected host adapter ports.

In one embodiment, the SAN management system may provide a LUN security utility 280, which may combine LUN security operations including, but not limited to, searching for and locating one or more LUNs 282, LUN selection, LUN to disk array port binding, LUN masking and fabric zoning operations in one utility. In one embodiment, the LUN security utility 280 may be provided to the user through the SAN manager 202 user interface. In one embodiment, the SAN manager may run on an administration system 106. In one embodiment, the LUN security utility 280 may provide a central utility that, through a graphical user interface, guides the user through configuring LUN security operations (finding and selecting one or more LUNs, binding, masking and zoning) and allows the user to execute the configured LUN security operations with a single operation, for example, a single click of a button in the user interface. Thus, the LUN security operations (finding and selecting one or more LUNs, binding, masking and zoning) may be performed as a single operation from the perspective of the user.

In one embodiment, if any portion of the LUN security operation (binding, masking, and/or zoning) configured and initiated by the user from the LUN security utility fails to successfully complete, the LUN security utility may "back out" of the entire configured LUN security operation, and may undo any portions of the LUN security operation already completed and/or leave undone any portions not yet performed. By so doing, the LUN security operation may leave the various SAN components being operated on by the LUN security operation in their original state before the start of the operation if any portion of the operation fails. Thus, LUN security operations configured and initiated using the LUN security utility may be viewed as transactions. A transaction may be defined as a sequence of information exchange and related work that is treated as a unit for the purposes of satisfying a request and for ensuring data integrity. For a transaction to be completed and changes to be made permanent, a transaction has to be completed in its entirety.
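
The back-out behavior described above is classic transactional rollback; a minimal Python sketch, assuming each step supplies its own undo callable, might be:

    def lun_security_transaction(steps):
        # steps: ordered list of (do, undo) callables, e.g. for the
        # bind, mask, and zone portions of the operation.
        completed = []
        try:
            for do, undo in steps:
                do()
                completed.append(undo)
        except Exception:
            # Undo the completed portions in reverse order so the SAN
            # components return to their original state, then re-raise.
            for undo in reversed(completed):
                undo()
            raise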

The SAN management system may provide a single point of management from the logical units of storage devices 104, through the interconnect, to SAN-connected hosts 102. The LUN security utility 280 may provide a central point from which to perform LUN security operations including LUN binding (the creation of access paths (LUNs) between Addressable Units within a disk array and ports on the array), LUN masking (enabling access to Addressable Units for host HBA ports) and fabric zoning (allowing the host to see the AU and create an Operating System (OS) handle for it).

The LUN security utility 280 may guide users through searching and locating, selecting, binding, masking and zoning operations. The LUN security utility 280 may be used to bind LUNs 282 to ports on the array and further mask the LUN(s) to target host HBA 284 ports. The LUN security utility 280 may include safety controls to ensure that invalid LUN binding and LUN masking configurations are not created. The LUN security utility 280 may support multiple storage array vendors, and thus may serve as a centralized utility for performing LUN security operations for heterogeneous SAN components.

Using the LUN security utility 280, users may specify LUNs 282 and disk array ports to bind. In one embodiment, the SAN management system may provide a LUN query tool for finding and selecting LUNs 282. Users may also use the LUN security utility 280 to select hosts' HBA 284 ports and LUNs 282 for LUN masking/security. The LUN security utility 280 may allow users to select a zone that contains the array port and a host's HBA port(s). If no such zone exists, the LUN security utility 280 may allow users to create a new zone or add the array port and the host's HBA 284 port(s) to an existing zone.

The component of the SAN management system that manages SAN discovery is the SAN access layer (not shown). Functions of the SAN access layer may include discovery and zoning. In one embodiment, the SAN access layer may be a component or "layer" of the SAN management server 200. In one embodiment, the SAN access layer may include one or more explorers (e.g. disk array explorers) that may discover storage devices 104 (e.g. disk arrays and enclosures) and information about the storage devices 104 such as the storage devices' ports, addressable units and LUNs 282. In one embodiment, the SAN access layer may discover LUNs 282 that are not masked to HBA 284 ports on discovered hosts 102 on the SAN. In one embodiment, the SAN access layer may also include one or more explorers (e.g. HBA explorers) that may interact with SAN hosts 102 to discover information about the hosts 102 such as the hosts' HBAs 284, HBA ports and device paths. In one embodiment, the SAN access layer may also include one or more explorers (e.g. zoning explorers) that may discover zone names and attributes.

Information about discovered SAN objects such as zones, hosts 102, HBAs 284, HBA ports, storage devices 104, array ports, addressable units and LUNs 282 may be provided to the SAN manager 202 and the SAN management server 200 by the SAN access layer. The SAN management server 200 may use the provided information, for example, to configure collectors to collect information on the discovered SAN objects. The SAN manager 202 may use the provided information, as well as collected SAN data from the SAN management server 200, in one or more displays of SAN information.

The user may launch the LUN security utility 280 from the SAN manager 202. The discovered SAN objects (e.g., zones, hosts 102, HBAs 284, HBA ports, storage devices 104, array ports, addressable units and LUNs 282) provided to the SAN manager 202 by the SAN access layer and/or SAN management server 200 may be provided to the user in the LUN security utility 280, and the user may locate and select from the objects when configuring LUN security operations using the LUN security utility 280 as described herein. As examples, array ports and addressable units may be selected for binding to create LUNs 282; LUNs 282 may be located and selected, and hosts 102, HBAs 284 and/or HBA ports may be selected to mask to the LUNs 282; and zones may be created and/or selected to which the HBA 284 ports and LUNs 282 are to be added. After selecting the SAN objects to be operated upon using the LUN security utility 280, the LUN security operations (e.g. binding, masking and zoning) may be performed as a single operation from the perspective of the user through the LUN security utility 280.

The LUN security operations as specified by the user in the LUN security utility 280 may be performed to establish device paths in the SAN. In one embodiment, the SAN access layer may perform the LUN security operations (e.g. binding, masking and zoning) as specified by the user in the LUN security utility 280. In one embodiment, the SAN access layer may pass LUN security commands generated by the LUN security utility 280 to the disk arrays' management interfaces for execution using the disk array explorers. In one embodiment, the SAN access layer may pass LUN security commands generated by the LUN security utility 280 to the hosts 102 for execution using the HBA explorers. In one embodiment, the SAN access layer may pass LUN security commands generated by the LUN security utility 280 to the fabric devices for execution using the zoning explorers.

In one embodiment, the SAN management system may provide a LUN query tool, accessible, for example, from the SAN manager, that may be used to search for and find LUNs on the SAN that match one or more properties, such as device vendor, storage type, capacity, configuration, cost, and location. The LUN query tool may allow the user to further refine the search for LUNs based on the storage group(s) the LUNs are assigned to and/or on their accessibility from specified SAN-attached hosts 102. The LUN query tool may return a list of all LUNs that meet those requirements. The LUN query tool may be used, for example, when performing LUN security operations (e.g. binding, masking and zoning) and when allocating storage to the requester. In one embodiment, after using the LUN query tool to generate a list of LUNs that match search criteria, the user may create or edit a LUN attribute and apply the new attribute value across multiple LUNs in-context from the LUN query tool.
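
A property-based LUN search of the kind the LUN query tool performs might look like this Python sketch; the inventory records and property names are invented for illustration:

    def query_luns(luns, **criteria):
        # luns: list of dicts describing discovered LUNs; criteria are
        # property filters such as vendor or capacity.
        return [lun for lun in luns
                if all(lun.get(key) == value
                       for key, value in criteria.items())]

    inventory = [
        {"name": "LUN0", "vendor": "Hitachi", "capacity_gb": 73},
        {"name": "LUN1", "vendor": "EMC",     "capacity_gb": 146},
    ]
    print(query_luns(inventory, vendor="Hitachi"))   # the LUN0 record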

CONCLUSION

Various embodiments may further include receiving, sending or storing instructions and/or data implemented in accordance with the foregoing description upon a carrier medium. Generally speaking, a carrier medium may include storage media or memory media such as magnetic or optical media, e.g., disk or CD-ROM, volatile or non-volatile media such as RAM (e.g. SDRAM, DDR SDRAM, RDRAM, SRAM, etc.), ROM, etc., as well as transmission media or signals such as electrical, electromagnetic, or digital signals, conveyed via a communication medium such as a network and/or a wireless link.

The various methods as illustrated in the Figures and described herein represent exemplary embodiments of methods. The methods may be implemented in software, hardware, or a combination thereof. The order of the methods may be changed, and various elements may be added, reordered, combined, omitted, modified, etc.

Various modifications and changes may be made as would be obvious to a person skilled in the art having the benefit of this disclosure. It is intended that the invention embrace all such modifications and changes and, accordingly, the above description is to be regarded in an illustrative rather than a restrictive sense.

1. A storage area network (SAN), comprising: a host system comprising anapplication; a storage device comprising application data; a SAN fabriccomprising a plurality of components for coupling the host system to thestorage device; a system external to the SAN fabric configured toimplement a storage path monitor, wherein the storage path monitor isconfigured to: monitor performance metrics of a plurality of pathsthrough the SAN fabric from the host system to the storage device,wherein the application accesses the application data via one of theplurality of paths; generating historical performance data for each pathof said plurality of paths based on the performance metrics over aperiod of time; determine current quality of service of each of thepaths from the monitored performance metrics; determine another one ofthe plurality of paths predicted based on the historical performancedata to provide a higher quality of service than the one of theplurality of paths; and change the paths so that the applicationaccesses the application data via the other one of the plurality ofpaths predicted to perform above a quality of service threshold.
 2. TheSAN as recited in claim 1, wherein the paths are defined by zones on theSAN fabric, and wherein, to change the paths, the storage path monitoris further configured to reconfigure one or more zones so that theapplication accesses the application data via the other one of theplurality of paths.
 3. The SAN as recited in claim 1, wherein, tomonitor performance metrics of a plurality of paths through the SANfabric from the host system to the storage device, the storage pathmonitor is further configured to monitor one or more performance metricsof one or more components of each path.
 4. The SAN as recited in claim1, wherein, to determine another one of the plurality of paths predictedto provide a higher quality of service than the one of the plurality ofpaths, the storage path monitor is further configured to apply one ormore selection rules to the performance metrics of each of the monitoredpaths.
 5. The SAN as recited in claim 1, wherein, to change the paths sothat the application accesses the application data via the other one ofthe plurality of paths, the storage path monitor is further configuredto change the paths to the other one of the plurality of paths if theone of the plurality of paths is performing below a quality of servicelow threshold.
 6. The SAN as recited in claim 1, wherein, to change thepaths so that the application accesses the application data via theother one of the plurality of paths, the storage path monitor is furtherconfigured to change the paths to the other one of the plurality ofpaths if the one of the plurality of paths is performing below a qualityof service low threshold and the other one of the plurality of paths ispredicted to perform above a quality of service high threshold.
 7. TheSAN as recited in claim 1, wherein the system is further configured toimplement a SAN management system configured to discover components ofthe SAN and to collect information from the components, and wherein, tomonitor performance metrics of a plurality of paths through the SANfabric from the host system to the storage device, the storage pathmonitor is further configured to access the information collected fromthe components by the SAN management system.
 8. A storage area network(SAN), comprising: a host system comprising an application; a storagedevice comprising application data; a SAN fabric comprising a pluralityof components for coupling the host system to the storage device; asystem configured to implement a storage path monitor, wherein thestorage path monitor is configured to: monitor performance metrics of aplurality of paths through the SAN fabric from the host system to thestorage device, wherein the application accesses the application datavia one of the plurality of paths; determine an interval for whichquality of service on the one of the plurality of paths is predicted tobe below a quality of service threshold from the monitored performancemetrics; determine another one of the plurality of paths predicted toprovide a higher quality of service for the interval than the one of theplurality of paths; and change the paths prior to the interval so thatthe application accesses the application data via the other one of theplurality of paths during the interval.
9. The SAN as recited in claim 8, wherein the paths are defined by zones on the SAN fabric, and wherein, to change the paths, the storage path monitor is further configured to reconfigure one or more zones so that the application accesses the application data via the other one of the plurality of paths.
10. A storage path monitor system, comprising: a processor; and a memory storing program instructions, wherein the program instructions are executable by the processor to: monitor performance metrics of a plurality of paths through a fabric of a storage area network (SAN) from a host system to a storage device, wherein an application on the host system accesses application data on the storage device via one of the plurality of paths; generate historical performance data for each path of said plurality of paths based on the performance metrics over a period of time; determine current quality of service of each of the paths from the monitored performance metrics; determine another one of the plurality of paths predicted, based on the historical performance data, to provide a higher quality of service than the one of the plurality of paths; and change the paths so that the application accesses the application data via the other one of the plurality of paths predicted to perform above a quality of service threshold; wherein the storage path monitor system is external to said fabric.
11. The system as recited in claim 10, wherein the paths are defined by zones on the fabric, and wherein, to change the paths, the program instructions are further executable by the processor to reconfigure one or more zones so that the application accesses the application data via the other one of the plurality of paths.
12. The system as recited in claim 10, wherein, to determine another one of the plurality of paths predicted to provide a higher quality of service than the one of the plurality of paths, the program instructions are further executable by the processor to apply one or more selection rules to the performance metrics of each of the monitored paths.
13. The system as recited in claim 10, wherein the system is further configured to implement a SAN management system configured to discover components of the SAN and to collect information from the components, and wherein, to monitor performance metrics of a plurality of paths through the fabric from the host system to the storage device, the program instructions are further executable by the processor to access the information collected from the components by the SAN management system.
14. A system, comprising: means for monitoring performance metrics of a plurality of paths through a fabric of a storage area network (SAN) from a host system to a storage device, wherein an application on the host system accesses application data on the storage device via one of the plurality of paths, wherein said monitoring comprises monitoring performance metrics from a location external to the fabric; means for generating historical performance data for each path of said plurality of paths based on the performance metrics over a period of time; means for determining current quality of service of each of the paths from the monitored performance metrics; means for determining another one of the plurality of paths predicted, based on the historical performance data, to provide a higher quality of service than the one of the plurality of paths; and means for changing the paths so that the application accesses the application data via the other one of the plurality of paths predicted to perform above a quality of service threshold.
15. A method, comprising: monitoring performance metrics of a plurality of paths through a fabric of a storage area network (SAN) from a host system to a storage device, wherein an application on the host system accesses application data on the storage device via one of the plurality of paths, wherein said monitoring comprises monitoring performance metrics from a location external to the fabric; generating historical performance data for each path of said plurality of paths based on the performance metrics over a period of time; determining current quality of service of each of the paths from the monitored performance metrics; determining another one of the plurality of paths predicted, based on the historical performance data, to provide a higher quality of service than the one of the plurality of paths; and changing the paths so that the application accesses the application data via the other one of the plurality of paths predicted to perform above a quality of service threshold.
16. The method as recited in claim 15, wherein the paths are defined by zones on the fabric, and wherein said changing the paths comprises reconfiguring one or more zones so that the application accesses the application data via the other one of the plurality of paths.
17. The method as recited in claim 15, wherein said determining another one of the plurality of paths predicted to provide a higher quality of service than the one of the plurality of paths comprises applying one or more selection rules to the performance metrics of each of the monitored paths.
18. The method as recited in claim 15, wherein one or more host systems of the SAN implement a SAN management system, the method further comprising: the SAN management system discovering components of the SAN and collecting information from the components; and wherein said monitoring performance metrics of a plurality of paths through the fabric from the host system to the storage device comprises accessing the information collected from the components by the SAN management system.
19. A computer storage medium storing program instructions, wherein the program instructions are executable to perform: monitoring performance metrics of a plurality of paths through a fabric of a storage area network (SAN) from a host system to a storage device, wherein an application on the host system accesses application data on the storage device via one of the plurality of paths, wherein said monitoring comprises monitoring said performance metrics from a location external to the fabric; generating historical performance data for each path of said plurality of paths based on the performance metrics over a period of time; determining current quality of service of each of the paths from the monitored performance metrics; determining another one of the plurality of paths predicted, based on the historical performance data, to provide a higher quality of service than the one of the plurality of paths; and changing the paths so that the application accesses the application data via the other one of the plurality of paths predicted to perform above a quality of service threshold.
20. The computer storage medium as recited in claim 19, wherein the paths are defined by zones on the fabric, and wherein, in said changing the paths, the program instructions are further executable to reconfigure one or more zones so that the application accesses the application data via the other one of the plurality of paths.
21. The computer storage medium as recited in claim 19, wherein, in said determining another one of the plurality of paths predicted to provide a higher quality of service than the one of the plurality of paths, the program instructions are further executable to apply one or more selection rules to the performance metrics of each of the monitored paths.
22. The computer storage medium as recited in claim 19, wherein one or more host systems of the SAN implement a SAN management system configured to discover components of the SAN and collect information from the SAN components, and wherein, in said monitoring performance metrics of a plurality of paths through the fabric from the host system to the storage device, the program instructions are further executable to access the information collected from the components by the SAN management system.